ENVIRONMENTAL STRESS TOLERANCE GENES

RELATED APPLICATION INFORMATION 

The present invention claims the benefit of US Provisional Patent Application Serial
Nos. 60/166,228 filed November 17, 1999 and 60/197,899 filed April 17, 2000 and "Plant Trait
Modification III" filed August 22, 2000.

FIELD OF THE INVENTION 

This invention relates to the field of plant biology. More particularly, the present
invention pertains to compositions and methods for phenotypically modifying a plant.

BACKGROUND OF THE INVENTION 

Transcription factors can modulate gene expression, either increasing or 
decreasing (inducing or repressing) the rate of transcription. This modulation results in 
differential levels of gene expression at various developmental stages, in different tissues and cell 
types, and in response to different exogenous (e.g., environmental) and endogenous stimuli 
throughout the life cycle of the organism.

Because transcription factors are key controlling elements of biological 
pathways, altering the expression levels of one or more transcription factors can change entire 
biological pathways in an organism. For example, manipulation of the levels of selected
transcription factors may result in increased expression of economically useful proteins or
metabolic chemicals in plants, or may improve other agriculturally relevant characteristics.
Conversely, blocked or reduced expression of a transcription factor may reduce biosynthesis of 
unwanted compounds or remove an undesirable trait. Therefore, manipulating transcription 
factor levels in a plant offers tremendous potential in agricultural biotechnology for modifying a 
plant's traits. 

The present invention provides novel transcription factors useful for modifying a
plant's phenotype in desirable ways, such as modifying a plant's environmental stress tolerance.

SUMMARY OF THE INVENTION 
In a first aspect, the invention relates to a recombinant polynucleotide comprising
a nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a
polypeptide comprising a sequence selected from SEQ ID Nos. 2N, where N=1-27, or a
complementary nucleotide sequence thereof; (b) a nucleotide sequence encoding a polypeptide
comprising a conservatively substituted variant of a polypeptide of (a); (c) a nucleotide sequence
comprising a sequence selected from those of SEQ ID Nos. 2N-1, where N=1-27, or a
complementary nucleotide sequence thereof; (d) a nucleotide sequence comprising silent
substitutions in a nucleotide sequence of (c); (e) a nucleotide sequence which hybridizes under
stringent conditions over substantially the entire length of a nucleotide sequence of one or more
of: (a), (b), (c), or (d); (f) a nucleotide sequence comprising at least 15 consecutive nucleotides of
a sequence of any of (a)-(e); (g) a nucleotide sequence comprising a subsequence or fragment of
any of (a)-(f), which subsequence or fragment encodes a polypeptide having a biological activity
that modifies a plant's environmental stress tolerance; (h) a nucleotide sequence having at least
30% sequence identity to a nucleotide sequence of any of (a)-(g); (i) a nucleotide sequence
having at least 60% sequence identity to a nucleotide sequence of any of (a)-(g); (j) a
nucleotide sequence which encodes a polypeptide having at least 30% sequence identity
to a polypeptide of SEQ ID Nos. 2N, where N=1-27; (k) a nucleotide sequence which encodes a
polypeptide having at least 60% sequence identity to a polypeptide of SEQ ID Nos. 2N,
where N=1-27; and (l) a nucleotide sequence which encodes a conserved domain of a polypeptide
having at least 65% sequence identity to a conserved domain of a polypeptide of SEQ ID Nos.
2N, where N=1-27. The recombinant polynucleotide may further comprise a constitutive,
inducible, or tissue-active promoter operably linked to the nucleotide sequence. The invention
also relates to compositions comprising at least two of the above described polynucleotides.
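By way of illustration only, the SEQ ID numbering used above (2N for polypeptides and 2N-1 for polynucleotides, with N=1-27) can be enumerated as in the following minimal sketch. The function name is hypothetical, and the pairing of each polynucleotide 2N-1 with polypeptide 2N as its encoded product is an assumption inferred from the numbering scheme, not a statement taken from the Sequence Listing.

```python
def seq_id_pairs(n_max: int = 27) -> list[tuple[int, int]]:
    """Return (polynucleotide SEQ ID No., polypeptide SEQ ID No.) for N = 1..n_max.

    Assumption: polynucleotide 2N-1 is paired with polypeptide 2N.
    """
    return [(2 * n - 1, 2 * n) for n in range(1, n_max + 1)]


if __name__ == "__main__":
    pairs = seq_id_pairs()
    # First pair, last pair, and number of pairs for N = 1..27.
    print(pairs[0], pairs[-1], len(pairs))
```

Under this reading, the odd SEQ ID Nos. 1 through 53 denote polynucleotides and the even SEQ ID Nos. 2 through 54 denote polypeptides.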

In a second aspect, the invention is an isolated or recombinant polypeptide
comprising a subsequence of at least about 15 contiguous amino acids encoded by the
recombinant or isolated polynucleotide described above.

In another aspect, the invention is a transgenic plant comprising one or more of the above 
described recombinant polynucleotides. In yet another aspect, the invention is a plant with 
altered expression levels of a polynucleotide described above or a plant with altered expression or 
activity levels of an above described polypeptide. Further, the invention may be a plant lacking a
nucleotide sequence encoding a polypeptide comprising a sequence selected from SEQ ID Nos.
2N, where N=1-27.

The plant may be a soybean, wheat, corn, potato, cotton, rice, oilseed rape,
sunflower, alfalfa, sugarcane, turf, banana, blackberry, blueberry, strawberry, raspberry,
cantaloupe, carrot, cauliflower, coffee, cucumber, eggplant, grapes, honeydew, lettuce, mango,
melon, onion, papaya, peas, peppers, pineapple, spinach, squash, sweet corn, tobacco, tomato,
watermelon, rosaceous fruits, or vegetable brassicas plant.

In a further aspect, the invention relates to a cloning or expression vector 
comprising the isolated or recombinant polynucleotide described above or cells comprising the 
cloning or expression vector.

In yet a further aspect, the invention relates to a composition produced by
incubating a polynucleotide of the invention with a nuclease; a restriction enzyme; a polymerase;
a polymerase and a primer; a cloning vector; or with a cell.

Furthermore, the invention relates to a method for producing a plant having
improved environmental stress tolerance. The method comprises altering the expression of an
isolated or recombinant polynucleotide of the invention or altering the expression or activity of a 
polypeptide of the invention in a plant to produce a modified plant, and selecting the modified 
plant for modified environmental stress tolerance. 

In another aspect, the invention relates to a method of identifying a factor that is 
modulated by or interacts with a polypeptide encoded by a polynucleotide of the invention. The
method comprises expressing a polypeptide encoded by the polynucleotide in a plant; and 
identifying at least one factor that is modulated by or interacts with the polypeptide. In one 
embodiment the method for identifying modulating or interacting factors is by detecting binding 
by the polypeptide to a promoter sequence, or by detecting interactions between an additional 
protein and the polypeptide in a yeast two-hybrid system, or by detecting expression of a factor by
hybridization to a microarray, subtractive hybridization or differential display. 

In yet another aspect, the invention is a method of identifying a molecule that 
modulates activity or expression of a polynucleotide or polypeptide of interest. The method 
comprises placing the molecule in contact with a plant comprising the polynucleotide or 
polypeptide encoded by the polynucleotide of the invention and monitoring one or more of the
expression level of the polynucleotide in the plant, the expression level of the polypeptide in the 
plant, and modulation of an activity of the polypeptide in the plant. 

In yet another aspect, the invention relates to an integrated system, computer or 
computer readable medium comprising one or more character strings corresponding to a 
polynucleotide of the invention, or to a polypeptide encoded by the polynucleotide. The
integrated system, computer or computer readable medium may comprise a link between one or 
more sequence strings to a modified plant environmental stress tolerance phenotype. 

In yet another aspect, the invention is a method for identifying a sequence similar 
or homologous to one or more polynucleotides of the invention, or one or more polypeptides 
encoded by the polynucleotides. The method comprises providing a sequence database; and,
querying the sequence database with one or more target sequences corresponding to the one or 
more polynucleotides or to the one or more polypeptides to identify one or more sequence 
members of the database that display sequence similarity or homology to one or more of the one 
or more target sequences.

The method may further comprise linking the one or more of the
polynucleotides of the invention, or encoded polypeptides, to a modified plant environmental 
stress tolerance phenotype. 
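The database-query method described above can be sketched, purely for illustration, as a toy k-mer comparison. This is not the search program contemplated by the specification; it is a simplified stand-in for established sequence-comparison tools, and the function names, the k-mer size, and the score threshold are all hypothetical choices made for this example.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def query_database(
    database: dict[str, str],
    target: str,
    k: int = 3,
    min_score: float = 0.5,
) -> list[tuple[str, float]]:
    """Rank database members by the fraction of target k-mers they share.

    Assumption: shared k-mer fraction is used here only as a crude proxy
    for sequence similarity; it is not an alignment score.
    """
    target_kmers = kmers(target, k)
    if not target_kmers:
        return []
    hits = []
    for name, seq in database.items():
        score = len(target_kmers & kmers(seq, k)) / len(target_kmers)
        if score >= min_score:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    db = {"entry_a": "ATGGCGTTAGC", "entry_b": "TTTTTTTTTTT"}
    print(query_database(db, "ATGGCGTTAGC"))
```

Each returned hit could then be linked to a stress tolerance phenotype record, for example through a dictionary keyed on the database member name.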

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides a table of exemplary polynucleotide and polypeptide sequences of the
invention. The table includes from left to right for each sequence: the SEQ ID No., the internal 
code reference number (GID), whether the sequence is a polynucleotide or polypeptide sequence, 
and identification of any conserved domains for the polypeptide sequences. 

Figure 2 provides a table of exemplary sequences that are homologous to other 
sequences provided in the Sequence Listing and that are derived from Arabidopsis thaliana. The
table includes from left to right: the SEQ ID No., the internal code reference number (GID), 
identification of the homologous sequence, whether the sequence is a polynucleotide or 
polypeptide sequence, and identification of any conserved domains for the polypeptide 
sequences. 

Figure 3 provides a table of exemplary sequences that are homologous to the sequences
provided in Figures 1 and 2 and that are derived from plants other than Arabidopsis thaliana. The 
table includes from left to right: the SEQ ID No., the internal code reference number (GID), the 
unique GenBank sequence ID No. (NID), the probability that the comparison was generated by 
chance (P-value), and the species from which the homologous gene was identified. 

DETAILED DESCRIPTION 

The present invention relates to polynucleotides and polypeptides, e.g., for
modifying phenotypes of plants.

In particular, the polynucleotides or polypeptides are useful for modifying traits
associated with a plant's environmental stress tolerance when the expression levels of the
polynucleotides or expression levels or activity levels of the polypeptides are altered.
Specifically, the polynucleotides and polypeptides are useful for modifying traits associated with
a plant's environmental stress tolerance, such as freezing, chilling, heat, drought, water saturation,
salt, photoconditions, radiation and ozone, or the like. Plants with altered expression of the
polynucleotides or polypeptides of the invention are more tolerant to these environmental stresses
compared with plants without altered expression levels.

The polynucleotides of the invention encode plant transcription factors. The
plant transcription factors are derived, e.g., from Arabidopsis thaliana and can belong, e.g., to one
or more of the following transcription factor families: the AP2 (APETALA2) domain
transcription factor family (Riechmann and Meyerowitz (1998) Biol. Chem. 379:633-646); the
MYB transcription factor family (Martin and Paz-Ares (1997) Trends Genet. 13:67-73); the
MADS domain transcription factor family (Riechmann and Meyerowitz (1997) Biol. Chem.
378:1079-1101); the WRKY protein family (Ishiguro and Nakamura (1994) Mol. Gen. Genet.
244:563-571); the ankyrin-repeat protein family (Zhang et al. (1992) Plant Cell 4:1575-1588);
the miscellaneous protein (MISC) family (Kim et al. (1997) Plant J. 11:1237-1251); the zinc
finger protein (Z) family (Klug and Schwabe (1995) FASEB J. 9:597-604); the homeobox (HB)
protein family (Duboule (1994) Guidebook to the Homeobox Genes, Oxford University Press);
the CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev. 3:1166-1178);
the squamosa promoter binding proteins (SPB) (Klein et al. (1996) Mol. Gen. Genet. 250:7-16);
the NAM protein family; the IAA/AUX proteins (Rouse et al. (1998) Science 279:1371-1373);
the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1:639-709); the
DNA-binding protein [...]; the [...] family of transcription factors [...]; the BPF-1 protein (Box
P-binding factor) family (da Costa e Silva et al. (1993) Plant J. 4:125-135); and the golden
protein (GLD) family (Hall et al. (1998) Plant Cell 10:925-936).

In addition to methods for modifying a plant phenotype by employing one or
more polynucleotides and polypeptides of the invention described herein, the polynucleotides
and polypeptides of the invention have a variety of additional uses. These uses include their use
in the recombinant production (i.e., expression) of proteins; as regulators of plant gene expression;
as diagnostic probes for the presence of complementary or partially complementary nucleic acids
(including for detection of natural coding nucleic acids); as substrates for further reactions, e.g.,
mutation reactions, PCR reactions, or the like; or as substrates for cloning, e.g., including
digestion or ligation reactions; [...] transcription factors.

DEFINITIONS 

A "polynucleotide" is a nucleic acid sequence comprising a plurality of
polymerized nucleotide residues, e.g., at least about 15 consecutive polymerized nucleotide
residues, optionally at least about 30 consecutive nucleotides, at least about 50 consecutive
nucleotides. In many instances, a polynucleotide comprises a nucleotide sequence encoding a
polypeptide (or protein) or a domain or fragment thereof. Additionally, the polynucleotide may
comprise a promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation
site, 5' or 3' untranslated regions, a reporter gene, a selectable marker, or the like. The
polynucleotide can be single stranded or double stranded DNA or RNA. The polynucleotide
optionally comprises modified bases or a modified backbone. The polynucleotide can be, e.g.,
genomic DNA or RNA, a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA,
a synthetic DNA or RNA, or the like. The polynucleotide can comprise a sequence in either
sense or antisense orientations.
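As a minimal illustration of the sense and antisense orientations mentioned above, the antisense strand of a DNA sequence is its reverse complement. The sketch below assumes an unambiguous DNA alphabet (A, C, G, T); the function name is hypothetical.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense(sense: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sense.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense("ATGGCC"))  # GGCCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check.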

A "recombinant polynucleotide" is a polynucleotide that is not in its native state, 
e.g., the polynucleotide comprises a nucleotide sequence not found in nature, or the 
polynucleotide is in a context other than that in which it is naturally found, e.g., separated from 
nucleotide sequences with which it typically is in proximity in nature, or adjacent (or contiguous
with) nucleotide sequences with which it typically is not in proximity. For example, the sequence
at issue can be cloned into a vector, or otherwise recombined with one or more additional nucleic
acids.

An "isolated polynucleotide" is a polynucleotide, whether naturally occurring or
recombinant, that is present outside the cell in which it is typically found in nature, whether
purified or not. Optionally, an isolated polynucleotide is subject to one or more enrichment or
purification procedures, e.g., cell lysis, extraction, centrifugation, precipitation, or the like.

A "recombinant polypeptide" is a polypeptide produced by translation of a
recombinant polynucleotide. An "isolated polypeptide," whether a naturally occurring or a
recombinant polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural
state in a wild type cell, e.g., more than about 5% enriched, more than about 10% enriched, or
more than about 20%, or more than about 50%, or more, enriched, i.e., alternatively denoted:
105%, 110%, 120%, 150% or more, enriched relative to wild type standardized at 100%. Such an
enrichment is not the result of a natural response of a wild type plant. Alternatively, or
additionally, the isolated polypeptide is separated from other cellular components with which it is
typically associated, e.g., by any of the various protein purification methods herein.

The term "transgenic plant" refers to a plant that contains genetic material not
found in a wild type plant of the same species, variety or cultivar. The genetic material may
include a transgene, an insertional mutagenesis event (such as by transposon or T-DNA
insertional mutagenesis), an activation tagging sequence, a mutated sequence, a homologous
recombination event or a sequence modified by chimeraplasty. Typically, the foreign genetic
material has been introduced into the plant by human manipulation.

A transgenic plant may contain an expression vector or cassette. The expression 
cassette typically comprises a polypeptide-encoding sequence operably linked (i.e., under
regulatory control of) to appropriate inducible or constitutive regulatory sequences that allow for
the expression of the polypeptide. The expression cassette can be introduced into a plant by
transformation or by breeding after transformation of a parent plant. A plant refers to a whole
plant as well as to a plant part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any
other plant material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems
that mimic biochemical or cellular components or processes in a cell.

The phrase "ectopic expression or altered expression" in reference to a
polynucleotide indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is
different from the expression pattern in a wild type plant or a reference plant of the same species.
For example, the polynucleotide or polypeptide is expressed in a cell or tissue type other than a
cell or tissue type in which the sequence is expressed in the wild type plant, or by expression at a
time other than at the time the sequence is expressed in the wild type plant, or by a response to
different inducible agents, such as hormones or environmental signals, or at different expression
levels (either higher or lower) compared with those found in a wild type plant. The term also
refers to altered expression patterns that are produced by lowering the levels of expression to
below the detection level or completely abolishing expression. The resulting expression pattern
can be transient or stable, constitutive or inducible. In reference to a polypeptide, the term
"ectopic expression or altered expression" further may relate to altered activity levels resulting
from the interactions of the polypeptides with exogenous or endogenous modulators or from
interactions with factors or as a result of the chemical modification of the polypeptides.

The term "fragment" or "domain," with respect to a polypeptide, refers to a
subsequence of the polypeptide. In some cases, the fragment or domain is a subsequence of the
polypeptide which performs at least one biological function of the intact polypeptide in
substantially the same manner, or to a similar extent, as does the intact polypeptide. For example,
a polypeptide fragment can comprise a recognizable structural motif or functional domain such as
a DNA binding domain that binds to a DNA promoter region, an activation domain or a domain
for protein-protein interactions. Fragments can vary in size from as few as 6 amino acids to the
full length of the intact polypeptide, but are preferably at least about 30 amino acids in length and
more preferably at least about 60 amino acids in length. In reference to a nucleotide sequence, "a
fragment" refers to any subsequence of a polynucleotide, typically, of at least about 15
consecutive nucleotides, preferably at least about 30 nucleotides, more preferably at least about 50, of any
of the sequences provided herein.

WO 01/36598    PCT/US00/31458

The term "trait" refers to a physiological, morphological, biochemical or physical
characteristic of a plant or particular plant material or cell. In some instances, this characteristic
is visible to the human eye, such as seed or plant size, or can be measured by available
biochemical techniques, such as the protein, starch or oil content of seed or leaves, or by the
observation of the expression level of genes, e.g., by employing Northern analysis, RT-PCR,
microarray gene expression assays or reporter gene expression systems, or by agricultural
observations such as stress tolerance, yield or pathogen tolerance.

"Trait modification" refers to a detectable difference in a characteristic in a plant 
ectopically expressing a polynucleotide or polypeptide of the present invention relative to a plant 
not doing so, such as a wild type plant. In some cases, the trait modification can be evaluated 
quantitatively. For example, the trait modification can entail at least about a 2% increase or 

decrease in an observed trait (difference), at least a 5% difference, at least about a 10%
difference, at least about a 20% difference, at least about a 30%, at least about a 50%, at least
about a 70%, or at least about a 100%, or an even greater difference. It is known that there can be
a natural variation in the modified trait. Therefore, the trait modification observed entails a
change of the normal distribution of the trait in the plants compared with the distribution
observed in wild type plants.
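The quantitative evaluation described above can be illustrated with a minimal sketch that compares the mean of a trait measured in modified plants against the mean measured in wild type plants. The function name and the sample values are hypothetical; a real evaluation would also compare the full distributions (for example with an appropriate statistical test), since natural variation is expected.

```python
from statistics import mean


def percent_difference(modified: list[float], wild_type: list[float]) -> float:
    """Percent change of the mean trait value relative to the wild type mean."""
    wt_mean = mean(wild_type)
    if wt_mean == 0:
        raise ValueError("wild type mean is zero; percent difference is undefined")
    return 100.0 * (mean(modified) - wt_mean) / wt_mean


if __name__ == "__main__":
    # Hypothetical measurements of a single trait in arbitrary units.
    wild_type = [10.0, 11.0, 9.0]
    modified = [12.0, 13.0, 11.0]
    print(percent_difference(modified, wild_type))
```

A positive result indicates an increase in the observed trait and a negative result a decrease, to be compared against thresholds such as the 2%, 5% or 10% differences recited above.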

Trait modifications of particular interest include those to seed (such as embryo
or endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: enhanced
tolerance to environmental conditions including freezing, chilling, heat, drought, water saturation,
radiation and ozone; improved tolerance to microbial, fungal or viral diseases; improved
tolerance to pest infestations, including nematodes, mollicutes, parasitic higher plants or the like;
decreased herbicide sensitivity; improved tolerance of heavy metals or enhanced ability to take up
heavy metals; improved growth under poor photoconditions (e.g., low light and/or short day
length), or changes in expression levels of genes of interest. Other phenotypes that can be
modified relate to the production of plant metabolites, such as variations in the production of
taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax monomers, anti-oxidants,
amino acids, lignins, cellulose, tannins, prenyllipids (such as chlorophylls and carotenoids),
glucosinolates, and terpenoids, enhanced or compositionally altered protein or oil production
(especially in seeds), or modified sugar (insoluble or soluble) and/or starch composition.
Physical plant characteristics that can be modified include cell development (such as the number
of trichomes), fruit and seed size and number, yields of plant parts such as stems, leaves and
roots, the stability of the seeds during storage, characteristics of the seed pod (e.g., susceptibility
to shattering), root hair length and quantity, internode distances, or the quality of seed coat. Plant
growth characteristics that can be modified include growth rate, germination rate of seeds, vigor
of plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering time,
flower abscission, rate of nitrogen uptake, biomass or transpiration characteristics, as well as
plant architecture characteristics such as apical dominance, branching patterns, number of organs,
organ identity, organ shape or size.

POLYPEPTIDES AND POLYNUCLEOTIDES OF THE INVENTION
The present invention provides, among other things, transcription factors (TFs),
and transcription factor homologue polypeptides, and isolated or recombinant polynucleotides
encoding the polypeptides. These polypeptides and polynucleotides may be employed to modify 
a plant's environmental stress tolerance. 

Exemplary polynucleotides encoding the polypeptides of the invention were
identified in the Arabidopsis thaliana GenBank database using publicly available sequence
analysis programs and parameters. Sequences initially identified were then further characterized
to identify sequences comprising specified sequence strings corresponding to sequence motifs 
present in families of known transcription factors. Polynucleotide sequences meeting such 
criteria were confirmed as transcription factors. 
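The motif-based characterization step described above can be sketched as a simple string search. The specification does not list the sequence strings actually used, so the single pattern below, the WRKYGQK heptapeptide commonly cited as the signature of the WRKY domain, is an illustrative assumption, as are the function and variable names.

```python
import re

# Hypothetical motif table: family name -> pattern over one-letter amino acid codes.
# Only one illustrative entry is given; the source does not specify its motif strings.
MOTIFS = {
    "WRKY": re.compile(r"WRKYGQK"),
}


def classify(protein: str) -> list[str]:
    """Return the names of all families whose motif occurs in the protein sequence."""
    protein = protein.upper()
    return [family for family, pattern in MOTIFS.items() if pattern.search(protein)]


if __name__ == "__main__":
    print(classify("MSDWRKYGQKPIK"))  # ['WRKY']
    print(classify("MSTNPKPQRK"))     # []
```

A sequence matching no motif returns an empty list and would not be retained as a candidate transcription factor under this simplified scheme.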

Additional polynucleotides of the invention were identified by screening
Arabidopsis thaliana and/or other plant cDNA libraries with probes corresponding to known
transcription factors under low stringency hybridization conditions. Additional sequences,
including full length coding sequences, were subsequently recovered by the rapid amplification of
cDNA ends (RACE) procedure, using a commercially available kit according to the
manufacturer's instructions. Where necessary, multiple rounds of RACE are performed to isolate
5' and 3' ends. The full length cDNA was then recovered by a routine end-to-end polymerase
chain reaction (PCR) using primers specific to the isolated 5' and 3' ends. Exemplary sequences
are provided in the Sequence Listing.

The polynucleotides of the invention were ectopically expressed in overexpressor
or knockout plants and changes in the environmental stress tolerance of the plants were observed.
Therefore, the polynucleotides and polypeptides can be employed to improve the environmental
stress resistance of plants.

Making polynucleotides 

The polynucleotides of the invention include sequences that encode transcription
factors and transcription factor homologue polypeptides and sequences complementary thereto, as
well as unique fragments of coding sequence, or sequence complementary thereto. Such
polynucleotides can be, e.g., DNA or RNA, e.g., mRNA, cRNA, synthetic RNA, genomic DNA,
cDNA, synthetic DNA, oligonucleotides, etc. The polynucleotides are either double-stranded or
single-stranded, and include either or both of sense (i.e., coding) sequences and antisense (i.e.,
non-coding, complementary) sequences. The polynucleotides include the coding sequence of a
transcription factor, or transcription factor homologue polypeptide, in isolation, in combination
with additional coding sequences (e.g., a purification tag, a localization signal, as a
fusion-protein, as a pre-protein, or the like), in combination with non-coding sequences (e.g., introns or
inteins, regulatory elements such as promoters, enhancers, terminators, and the like), and/or in a
vector or host environment in which the polynucleotide encoding a transcription factor or
transcription factor homologue polypeptide is an endogenous or exogenous gene.

A variety of methods exist for producing the polynucleotides of the invention.
Procedures for identifying and isolating DNA clones are well known to those of skill in the art,
and are described in, e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods
in Enzymology volume 152, Academic Press, Inc., San Diego, CA ("Berger"); Sambrook et al.,
Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory,
Cold Spring Harbor, New York, 1989 ("Sambrook") and Current Protocols in Molecular Biology,
F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel").

Alternatively, polynucleotides of the invention can be produced by a variety of
in vitro amplification methods adapted to the present invention by appropriate selection of
specific or degenerate primers. Examples of protocols sufficient to direct persons of skill through
in vitro amplification methods, including the polymerase chain reaction (PCR), the ligase chain
reaction (LCR), Qbeta-replicase amplification and other RNA polymerase mediated techniques
(e.g., NASBA), e.g., for the production of the homologous nucleic acids of the invention are
found in Berger, Sambrook, and Ausubel, as well as Mullis et al., (1987) PCR Protocols: A Guide
to Methods and Applications (Innis et al., eds.) Academic Press Inc., San Diego, CA (1990) ("Innis").
Improved methods for cloning in vitro amplified nucleic acids are described in Wallace et al.,
U.S. Pat. No. 5,426,039. Improved methods for amplifying large nucleic acids by PCR are
summarized in Cheng et al. (1994) Nature 369: 684-685 and the references cited therein, in which
PCR amplicons of up to 40kb are generated. One of skill will appreciate that essentially any
RNA can be converted into a double stranded DNA suitable for restriction digestion, PCR
expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., Ausubel,
Sambrook and Berger, all supra.

Alternatively, polynucleotides and oligonucleotides of the invention can be
assembled from fragments produced by solid-phase synthesis methods. Typically, fragments of
up to approximately 100 bases are individually synthesized and then enzymatically or chemically
ligated to produce a desired sequence, e.g., a polynucleotide encoding all or part of a
transcription factor. For example, chemical synthesis using the phosphoramidite method is
described, e.g., by Beaucage et al. (1981) Tetrahedron Letters 22:1859-69; and Matthes et al.
(1984) EMBO J. 3:801-5. According to such methods, oligonucleotides are synthesized, purified,
annealed to their complementary strand, ligated and then optionally cloned into suitable vectors.
And if so desired, the polynucleotides and polypeptides of the invention can be custom ordered
from any of a number of commercial suppliers.
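The fragment-based assembly described above can be illustrated with a minimal sketch that divides a desired sequence into consecutive fragments of at most 100 bases for individual synthesis. This is a simplification made for illustration: it ignores the complementary strand and the overlaps that annealing and ligation would require in practice, and the function name and default length are hypothetical.

```python
def split_for_synthesis(sequence: str, max_len: int = 100) -> list[str]:
    """Split a sequence into consecutive fragments no longer than max_len bases."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [sequence[i:i + max_len] for i in range(0, len(sequence), max_len)]


if __name__ == "__main__":
    target = "ACGT" * 60  # a 240-base example sequence
    fragments = split_for_synthesis(target)
    print([len(fragment) for fragment in fragments])  # [100, 100, 40]
    # Joining the fragments in order restores the desired sequence.
    print("".join(fragments) == target)  # True
```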

HOMOLOGOUS SEQUENCES 

Sequences homologous, i.e., that share significant sequence identity or similarity, 
to those provided in the Sequence Listing, derived from Arabidopsis thaliana or from other plants 
of choice are also an aspect of the invention. Homologous sequences can be derived from any 
plant including monocots and dicots and in particular agriculturally important plant species, 
including but not limited to, crops such as soybean, wheat, corn, potato, cotton, rice, oilseed rape 
(including canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as 
banana, blackberry, blueberry, strawberry, and raspberry, cantaloupe, carrot, cauliflower, coffee, 
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, 
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits (such as
apple, peach, pear, cherry and plum) and vegetable brassicas (such as broccoli, cabbage,
cauliflower, Brussels sprouts and kohlrabi). Other crops, fruits and vegetables whose phenotype
can be changed include barley, rye, millet, sorghum, currant, avocado, citrus fruits such as
oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the walnut and
peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, yam, and sweet
potato, and beans. The homologous sequences may also be derived from woody species, such
as pine, poplar and eucalyptus.

Transcription factors that are homologous to the listed sequences will typically 
share at least about 30% amino acid sequence identity. More closely related transcription factors 
can share at least about 50%, about 60%, about 65%, about 70%, about 75% or about 80% or 
about 90% or about 95% or about 98% or more sequence identity with the listed sequences. 
Factors that are most closely related to the listed sequences share, e.g., at least about 85%, about 
90% or about 95% or more sequence identity to the listed sequences. At the nucleotide level,
the sequences will typically share at least about 40% nucleotide sequence identity, preferably at
least about 50%, about 60%, about 70% or about 80% sequence identity, and more preferably
about 85%, about 90%, about 95% or about 97% or more sequence identity to one or more of the
listed sequences. The degeneracy of the genetic code enables major variations in the nucleotide
sequence of a polynucleotide while maintaining the amino acid sequence of the encoded protein.
Conserved domains within a transcription factor family may exhibit a higher degree of sequence
homology, such as at least 65% sequence identity including conservative substitutions, and
preferably at least 80% sequence identity.
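
The identity thresholds above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and that identity is counted over all aligned columns; other conventions for the denominator exist, and the specification does not prescribe one. The function names and the example peptides are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Assumption: columns where exactly one sequence has a gap count as
    mismatches; columns where both have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold_percent: float) -> bool:
    """True if the pair shares at least the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


if __name__ == "__main__":
    # Hypothetical five-residue peptides differing at one position.
    print(percent_identity("MKTAY", "MKSAY"))
    print(meets_identity_threshold("MKTAY", "MKSAY", 30.0))
```

The same function applies to aligned nucleotide sequences, since it only compares characters column by column.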

Identifying Nucleic Acids by Hybridization

Polynucleotides homologous to the sequences illustrated in the Sequence Listing 
can be identified, e.g., by hybridization to each other under stringent or under highly stringent 
conditions. Single stranded polynucleotides hybridize when they associate based on a variety of
well characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base
stacking and the like. The stringency of a hybridization reflects the degree of sequence identity
of the nucleic acids involved, such that the higher the stringency, the more similar are the two
polynucleotide strands. Stringency is influenced by a variety of factors, including temperature,
salt concentration and composition, organic and non-organic additives, solvents, etc. present in
both the hybridization and wash solutions and incubations (and number), as described in more
detail in the references cited above.

An example of stringent hybridization conditions for hybridization of
complementary nucleic acids which have more than 100 complementary residues on a filter in a
Southern or northern blot is about 5°C to 20°C lower than the thermal melting point (Tm) for the
specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined
ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched
probe. Nucleic acid molecules that hybridize under stringent conditions will typically hybridize
to a probe based on either the entire cDNA or selected portions, e.g., to a unique subsequence, of
the cDNA under wash conditions of 0.2x SSC to 2.0x SSC, 0.1% SDS at 50-65°C, for example
0.2x SSC, 0.1% SDS at 65°C. For identification of less closely related homologues, washes can
be performed at a lower temperature, e.g., 50°C. In general, stringency is increased by raising
the wash temperature and/or decreasing the concentration of SSC.
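
The relationship between Tm, the stringent temperature window and wash stringency can be sketched as follows. The Tm estimate uses a commonly cited empirical approximation for long DNA:DNA hybrids in the absence of formamide; that formula is an assumption brought in for illustration and is not taken from this specification. All function names are hypothetical.

```python
import math


def estimate_tm_celsius(gc_percent: float, length_nt: int, na_molar: float) -> float:
    """Rough Tm estimate for a long DNA:DNA hybrid.

    Assumption (not from the specification): the empirical approximation
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 500/L, with no formamide.
    """
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 500.0 / length_nt


def stringent_window(tm_celsius: float) -> tuple:
    """Temperature range about 5 to 20 degrees C below Tm, as (low, high)."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)


def is_more_stringent(wash_a: tuple, wash_b: tuple) -> bool:
    """Compare two washes given as (SSC fold concentration, temperature in C).

    Wash A is more stringent if it is no less stringent on either axis
    (SSC no higher, temperature no lower) and stricter on at least one.
    """
    ssc_a, temp_a = wash_a
    ssc_b, temp_b = wash_b
    no_less = ssc_a <= ssc_b and temp_a >= temp_b
    stricter = ssc_a < ssc_b or temp_a > temp_b
    return no_less and stricter


if __name__ == "__main__":
    tm = estimate_tm_celsius(gc_percent=50.0, length_nt=500, na_molar=0.195)
    print(round(tm, 1), stringent_window(tm))
    # 0.2x SSC at 65 C versus 2.0x SSC at 50 C.
    print(is_more_stringent((0.2, 65.0), (2.0, 50.0)))
```

Washes that differ in opposite directions (lower salt but lower temperature) are not ordered by this comparison, which is deliberate: the text only states the direction of each factor, not how to trade one against the other.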

As another example, stringent conditions can be selected such that an
oligonucleotide that is perfectly complementary to the coding oligonucleotide hybridizes to the
coding oligonucleotide with at least about a 5-10x higher signal to noise ratio than the ratio for
hybridization of the perfectly complementary oligonucleotide to a nucleic acid encoding a
transcription factor known as of the filing date of the application. Conditions can be selected
such that a higher signal to noise ratio is observed in the particular assay which is used, e.g.,
about 15x, 25x, 35x, 50x or more. Accordingly, the subject nucleic acid hybridizes to the unique
coding oligonucleotide with at least a 2x higher signal to noise ratio as compared to hybridization
of the coding oligonucleotide to a nucleic acid encoding a known polypeptide. Again, higher
signal to noise ratios can be selected, e.g., about 5x, 10x, 25x, 35x, 50x or more. The particular
signal will depend on the label used in the relevant assay, e.g., a fluorescent label, a colorimetric
label, a radioactive label, or the like.

Alternatively, transcription factor homologue polypeptides can be obtained by
screening an expression library using antibodies specific for one or more transcription factors.
With the provision herein of the disclosed transcription factor, and transcription factor homologue
nucleic acid sequences, the encoded polypeptide(s) can be expressed and purified in a
heterologous expression system (e.g., E. coli) and used to raise antibodies (monoclonal or
polyclonal) specific for the polypeptide(s) in question. Antibodies can also be raised against
synthetic peptides derived from transcription factor, or transcription factor homologue, amino
acid sequences. Methods of raising antibodies are well known in the art and are described in
Harlow and Lane (1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New
York. Such antibodies can then be used to screen an expression library produced from the plant
from which it is desired to clone additional transcription factor homologues, using the methods
described above. The selected cDNAs can be confirmed by sequencing and enzymatic activity.

SEQUENCE VARIATIONS

It will readily be appreciated by those of skill in the art that any of a variety of
polynucleotide sequences are capable of encoding the transcription factors and transcription
factor homologue polypeptides of the invention. Due to the degeneracy of the genetic code,
many different polynucleotides can encode identical and/or substantially similar polypeptides in
addition to those sequences illustrated in the Sequence Listing.

For example, Table 1 illustrates that the codons AGC, AGT, TCA, TCC,
TCG, and TCT all encode the same amino acid: serine. Accordingly, at each position in the
sequence where there is a codon encoding serine, any of the above trinucleotide sequences can be
used without altering the encoded polypeptide.






Table 1

Amino acids                     Codon
Alanine         Ala   A         GCA  GCC  GCG  GCT
Cysteine        Cys   C         TGC  TGT
Aspartic acid   Asp   D         GAC  GAT
Glutamic acid   Glu   E         GAA  GAG
Phenylalanine   Phe   F         TTC  TTT
Glycine         Gly   G         GGA  GGC  GGG  GGT
Histidine       His   H         CAC  CAT
Isoleucine      Ile   I         ATA  ATC  ATT
Lysine          Lys   K         AAA  AAG
Leucine         Leu   L         TTA  TTG  CTA  CTC  CTG  CTT
Methionine      Met   M         ATG
Asparagine      Asn   N         AAC  AAT
Proline         Pro   P         CCA  CCC  CCG  CCT
Glutamine       Gln   Q         CAA  CAG
Arginine        Arg   R         AGA  AGG  CGA  CGC  CGG  CGT
Serine          Ser   S         AGC  AGT  TCA  TCC  TCG  TCT
Threonine       Thr   T         ACA  ACC  ACG  ACT
Valine          Val   V         GTA  GTC  GTG  GTT
Tryptophan      Trp   W         TGG
Tyrosine        Tyr   Y         TAC  TAT











Sequence alterations that do not change the amino acid sequence encoded by the 
polynucleotide are termed "silent" variations. With the exception of the codons ATG and TGG, 
encoding methionine and tryptophan, respectively, any of the possible codons for the same amino 
acid can be substituted by a variety of techniques, e.g., site-directed mutagenesis, available in the 
art. Accordingly, any and all such variations of a sequence selected from the above table are a 
feature of the invention. 
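
As an illustration of silent variation, the following minimal sketch transcribes the codon assignments of Table 1 into a lookup, checks whether two coding sequences encode the same polypeptide, and counts how many distinct coding sequences can encode a given polypeptide. The function names and example sequences are hypothetical; stop codons are not part of Table 1 and are therefore not handled.

```python
from math import prod

# Codon assignments transcribed from Table 1 (one-letter residue codes).
CODONS_BY_RESIDUE = {
    "A": "GCA GCC GCG GCT",
    "C": "TGC TGT",
    "D": "GAC GAT",
    "E": "GAA GAG",
    "F": "TTC TTT",
    "G": "GGA GGC GGG GGT",
    "H": "CAC CAT",
    "I": "ATA ATC ATT",
    "K": "AAA AAG",
    "L": "TTA TTG CTA CTC CTG CTT",
    "M": "ATG",
    "N": "AAC AAT",
    "P": "CCA CCC CCG CCT",
    "Q": "CAA CAG",
    "R": "AGA AGG CGA CGC CGG CGT",
    "S": "AGC AGT TCA TCC TCG TCT",
    "T": "ACA ACC ACG ACT",
    "V": "GTA GTC GTG GTT",
    "W": "TGG",
    "Y": "TAC TAT",
}

# Reverse lookup: codon -> residue.
RESIDUE_BY_CODON = {
    codon: residue
    for residue, codons in CODONS_BY_RESIDUE.items()
    for codon in codons.split()
}


def translate(dna: str) -> str:
    """Translate a coding sequence without stop codon (raises KeyError on one)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(RESIDUE_BY_CODON[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(dna_a: str, dna_b: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return dna_a.upper() != dna_b.upper() and translate(dna_a) == translate(dna_b)


def count_encodings(protein: str) -> int:
    """Number of distinct coding sequences that encode the polypeptide."""
    return prod(len(CODONS_BY_RESIDUE[residue].split()) for residue in protein.upper())


if __name__ == "__main__":
    print(translate("ATGAGCTGG"))
    print(is_silent_variant("ATGAGCTGG", "ATGTCTTGG"))
    print(count_encodings("MSW"))
```

Because methionine and tryptophan each have a single codon, they contribute a factor of one to the count, consistent with the exception noted above for ATG and TGG.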

In addition to silent variations, other conservative variations that alter one or a
few amino acids in the encoded polypeptide can be made without altering the function of the
polypeptide; these conservative variants are, likewise, a feature of the invention.

For example, substitutions, deletions and insertions introduced into the sequences 
provided in the Sequence Listing are also envisioned by the invention. Such sequence 
modifications can be engineered into a sequence by site-directed mutagenesis (Wu (ed.) Meth.
Enzymol. (1993) vol. 217, Academic Press) or the other methods noted below. Amino acid
substitutions are typically of single residues; insertions usually will be on the order of about
1 to 10 amino acid residues; and deletions will range from about 1 to 30 residues. In preferred
embodiments, deletions or insertions are made in adjacent pairs, e.g., a deletion of two residues or
insertion of two residues. Substitutions, deletions, insertions or any combination thereof can be
combined to arrive at a sequence. The mutations that are made in the polynucleotide encoding the
transcription factor should not place the sequence out of reading frame and should not create 
complementary regions that could produce secondary mRNA structure. Preferably, the 
polypeptide encoded by the DNA performs the desired function. 
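
The reading-frame requirement can be checked mechanically. The sketch below assumes that both sequences are coding sequences starting at the same position, so that an insertion or deletion preserves the downstream frame only when the net length change is a multiple of three nucleotides. The additional check for an internal stop codon (TAA, TAG or TGA in the standard genetic code) is an extra safeguard not stated in the text, and no attempt is made to predict secondary mRNA structure. Function names and sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def preserves_reading_frame(original_cds: str, modified_cds: str) -> bool:
    """True if the net insertion/deletion length is a multiple of three."""
    return (len(modified_cds) - len(original_cds)) % 3 == 0


def has_premature_stop(cds: str) -> bool:
    """True if any codon before the last complete codon is a stop codon."""
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    return any(codon in STOP_CODONS for codon in codons[:-1])


if __name__ == "__main__":
    original = "ATGGCTAGCTGA"
    # Deleting one codon (three nucleotides) keeps the frame.
    print(preserves_reading_frame(original, "ATGGCTTGA"))
    # Deleting a single nucleotide shifts the frame.
    print(preserves_reading_frame(original, "ATGGCAGCTGA"))
    print(has_premature_stop("ATGTAAGCTTGA"))
```

A deletion of two residues, as in the preferred embodiments above, corresponds to six nucleotides and therefore passes the frame check.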

Conservative substitutions are those in which at least one residue in the amino 
acid sequence has been removed and a different residue inserted in its place. Such substitutions 
generally are made in accordance with Table 2 when it is desired to maintain the activity of
the protein. Table 2 shows amino acids which can be substituted for an amino acid in a protein 
and which are typically regarded as conservative substitutions. 

Table 2

Residue    Conservative Substitutions
Ala        Ser
Arg        Lys
Asn        Gln; His
Asp        Glu
Gln        Asn
Cys        Ser
Glu        Asp
Gly        Pro
His        Asn; Gln
Ile        Leu; Val
Leu        Ile; Val
Lys        Arg; Gln
Met        Leu; Ile
Phe        Met; Leu; Tyr
Ser        Thr; Gly
Thr        Ser; Val
Trp        Tyr
Tyr        Trp; Phe
Val        Ile; Leu



Substitutions that are less conservative than those in Table 2 can be selected by
picking residues that differ more significantly in their effect on maintaining (a) the structure of
the polypeptide backbone in the area of the substitution, for example, as a sheet or helical
conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of
the side chain. The substitutions which in general are expected to produce the greatest changes in
protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl;
(b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g.,
phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
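
The distinction between conservative and less conservative substitutions can be expressed as a simple lookup against Table 2. The sketch below transcribes the table as printed, reading each row directionally (original residue to permitted replacement); treating the table as directional is an assumption, and the function name is hypothetical. Residues with no row in Table 2, such as proline, have no conservative replacement under this reading.

```python
# Table 2 transcribed as: original residue -> conservative replacements.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Gln": {"Asn"},
    "Cys": {"Ser"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if Table 2 lists `replacement` as conservative for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())


if __name__ == "__main__":
    print(is_conservative("Ala", "Ser"))  # listed in Table 2
    print(is_conservative("Ser", "Ala"))  # not listed in the Ser row
    print(is_conservative("Lys", "Glu"))  # charge reversal, not listed
```

A substitution that fails this lookup is not necessarily disruptive; it simply falls outside Table 2 and would be assessed against criteria (a) to (d) above.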

FURTHER MODIFYING SEQUENCES OF THE INVENTION - MUTATION/FORCED
EVOLUTION

In addition to generating silent or conservative substitutions as noted above, the
present invention optionally includes methods of modifying the sequences of the Sequence
Listing. In the methods, nucleic acid or protein modification methods are used to alter the given
sequences to produce new sequences and/or to chemically or enzymatically modify given
sequences to change the properties of the nucleic acids or proteins.

Thus, in one embodiment, given nucleic acid sequences are modified, e.g.,
according to standard mutagenesis or artificial evolution methods to produce modified sequences.
For example, Ausubel, supra, provides additional details on mutagenesis methods. Artificial
forced evolution methods are described, e.g., by Stemmer (1994) Nature 370:389-391, and
Stemmer (1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. Many other mutation and
evolution methods are also available and expected to be within the skill of the practitioner.

Similarly, chemical or enzymatic alteration of expressed nucleic acids and
polypeptides can be performed by standard methods. For example, a sequence can be modified by
addition of lipids, sugars, peptides, organic or inorganic compounds, by the inclusion of modified
nucleotides or amino acids, or the like. For example, protein modification techniques are
illustrated in Ausubel, supra. Further details on chemical and enzymatic modifications can be
found herein. These modification methods can be used to modify any given sequence, or to
modify any sequence produced by the various mutation and artificial evolution modification
methods noted herein.

Accordingly, the invention provides for modification of any given nucleic acid 
by mutation, evolution, chemical or enzymatic modification, or other available methods, as well 
as for the products produced by practicing such methods, e.g., using the sequences herein as a 
starting substrate for the various modification approaches.

For example, an optimized coding sequence containing codons preferred by a
particular prokaryotic or eukaryotic host can be used, e.g., to increase the rate of translation or to
produce recombinant RNA transcripts having desirable properties, such as a longer half-life, as 
compared with transcripts produced using a non-optimized sequence. Translation stop codons 
can also be modified to reflect host preference. For example, preferred stop codons for S. 
cerevisiae and mammals are TAA and TGA, respectively. The preferred stop codon for 
monocotyledonous plants is TGA, whereas insects and E. coli prefer to use TAA as the stop 
codon. 
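
The stop codon preferences listed above can be applied to a coding sequence as in the following minimal sketch. The host labels used as dictionary keys and the function name are hypothetical; the preferences themselves are those stated in the preceding paragraph, and the set of stop codons is that of the standard genetic code.

```python
# Preferred stop codons as stated in the text; the key labels are illustrative.
PREFERRED_STOP_CODON = {
    "S. cerevisiae": "TAA",
    "mammal": "TGA",
    "monocot": "TGA",
    "insect": "TAA",
    "E. coli": "TAA",
}

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def set_preferred_stop(cds: str, host: str) -> str:
    """Replace the terminal stop codon with the one preferred by the host."""
    cds = cds.upper()
    if len(cds) % 3 != 0 or cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with an in-frame stop codon")
    return cds[:-3] + PREFERRED_STOP_CODON[host]


if __name__ == "__main__":
    # Hypothetical three-codon coding sequence ending in TGA.
    print(set_preferred_stop("ATGGCTTGA", "E. coli"))
```

Only the terminal codon is changed, so the encoded polypeptide is unaffected.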

The polynucleotide sequences of the present invention can also be engineered in 
order to alter a coding sequence for a variety of reasons, including but not limited to, alterations 
which modify the sequence to facilitate cloning, processing and/or expression of the gene 
product. For example, alterations are optionally introduced using techniques which are well 
known in the art, e.g., site-directed mutagenesis, to insert new restriction sites, to alter 
glycosylation patterns, to change codon preference, to introduce splice sites, etc. 

Furthermore, a fragment or domain derived from any of the polypeptides of the 
invention can be combined with domains derived from other transcription factors or synthetic 
domains to modify the biological activity of a transcription factor. For instance, a DNA binding 
domain derived from a transcription factor of the invention can be combined with the activation 
domain of another transcription factor or with a synthetic activation domain. A transcription 
activation domain assists in initiating transcription from a DNA binding site. Examples include 
the transcription activation region of VP16 or GAL4 (Moore et al. (1998) Proc. Natl. Acad. Sci.
USA 95: 376-381; and Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from
bacterial sequences (Ma and Ptashne (1987) Cell 51:113-119) and synthetic peptides (Giniger
and Ptashne, (1987) Nature 330:670-672).

EXPRESSION AND MODIFICATION OF POLYPEPTIDES

Typically, polynucleotide sequences of the invention are incorporated into 
recombinant DNA (or RNA) molecules that direct expression of polypeptides of the invention in 
appropriate host cells, transgenic plants, in vitro translation systems, or the like. Due to the 
inherent degeneracy of the genetic code, nucleic acid sequences which encode substantially the 
same or a functionally equivalent amino acid sequence can be substituted for any listed sequence 
to provide for cloning and expressing the relevant homologue.

Vectors, Promoters and Expression Systems

The present invention includes recombinant constructs comprising one or more 
of the nucleic acid sequences herein. The constructs typically comprise a vector, such as a 
plasmid, a cosmid, a phage, a virus (e.g., a plant virus), a bacterial artificial chromosome (BAC), 
a yeast artificial chromosome (YAC), or the like, into which a nucleic acid sequence of the
invention has been inserted, in a forward or reverse orientation. In a preferred aspect of this 
embodiment, the construct further comprises regulatory sequences, including, for example, a 
promoter, operably linked to the sequence. Large numbers of suitable vectors and promoters are 
known to those of skill in the art, and are commercially available. 

General texts which describe molecular biological techniques useful herein,
including the use and production of vectors, promoters and many other relevant topics, include
Berger, Sambrook and Ausubel, supra. Any of the identified sequences can be incorporated into a
cassette or vector, e.g., for expression in plants. A number of expression vectors suitable for stable
transformation of plant cells or for the establishment of transgenic plants have been described,
including those described in Weissbach and Weissbach (1989) Methods for Plant Molecular
Biology, Academic Press, and Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer
Academic Publishers. Specific examples include those derived from a Ti plasmid of
Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. (1983) Nature
303: 209, Bevan (1984) Nucl Acid Res. 12: 8711-8721, Klee (1985) Bio/Technology 3: 637-642,
for dicotyledonous plants.

Alternatively, non-Ti vectors can be used to transfer the DNA into
monocotyledonous plants and cells by using free DNA delivery techniques. Such methods can
involve, for example, the use of liposomes, electroporation, microprojectile bombardment, silicon
carbide whiskers, and viruses. By using these methods transgenic plants such as wheat, rice
(Christou (1991) Bio/Technology 9: 957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618)
can be produced. An immature embryo can also be a good target tissue for monocots for
direct DNA delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol 102:
1077-1084; Vasil (1993) Bio/Technology 10: 667-674; Wan and Lemaux (1994) Plant Physiol
104: 37-48), and for Agrobacterium-mediated DNA transfer (Ishida et al. (1996) Nature Biotech
14: 745-750).

Typically, plant transformation vectors include one or more cloned plant coding
sequences (genomic or cDNA) under the transcriptional control of 5' and 3' regulatory sequences
and a dominant selectable marker. Such plant transformation vectors typically also contain a
promoter (e.g., a regulatory region controlling inducible or constitutive, environmentally- or
developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start
site, an RNA processing signal (such as intron splice sites), a transcription termination site, and/or
a polyadenylation signal.

Examples of constitutive plant promoters which can be useful for expressing the
TF sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers
constitutive, high-level expression in most plant tissues (see, e.g., Odell et al. (1985) Nature
313:810); the nopaline synthase promoter (An et al. (1988) Plant Physiol 88:547); and the
octopine synthase promoter (Fromm et al. (1989) Plant Cell 1: 977).

A variety of plant gene promoters that regulate gene expression in response to
environmental, hormonal, chemical, developmental signals, and in a tissue-active manner can be
used for expression of a TF sequence in plants. Choice of a promoter is based largely on the
phenotype of interest and is determined by such factors as tissue (e.g., seed, fruit, root, pollen,
vascular tissue, flower, carpel, etc.), inducibility (e.g., in response to wounding, heat, cold,
drought, light, pathogens, etc.), timing, developmental stage, and the like. Numerous known
promoters have been characterized and can favorably be employed to promote expression of a
polynucleotide of the invention in a transgenic plant or cell of interest. For example, tissue
specific promoters include: seed-specific promoters (such as the napin, phaseolin or DC3
promoter described in US Pat. No. 5,773,697), fruit-specific promoters that are active during fruit
ripening (such as the dru1 promoter (US Pat. No. 5,783,393), or the 2A11 promoter (US Pat. No.
4,943,674) and the tomato polygalacturonase promoter (Bird et al. (1988) Plant Mol Biol 11:651)),
root-specific promoters, such as those disclosed in US Patent Nos. 5,618,988, 5,837,848 and
5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (US Pat. No. 5,792,929),
promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol Biol 37:977-988),
flower-specific (Kaiser et al. (1995) Plant Mol Biol 28:231-243), pollen (Baerson et al. (1994) Plant Mol
Biol 26:1947-1959), carpels (Ohl et al. (1990) Plant Cell 2:837-848), pollen and ovules (Baerson
et al. (1993) Plant Mol Biol 22:255-267), auxin-inducible promoters (such as that described in
van der Kop et al. (1999) Plant Mol Biol 39:979-990 or Baumann et al. (1999) Plant Cell 11:323-334),
cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol Biol 38:743-753),
promoters responsive to gibberellin (Shi et al. (1998) Plant Mol Biol 38:1053-1060, Willmott et
al. (1998) 38:817-825) and the like. Additional promoters are those that elicit expression in
response to heat (Ainley et al. (1993) Plant Mol Biol 22: 13-23), light (e.g., the pea rbcS-3A
promoter, Kuhlemeier et al. (1989) Plant Cell 1:471, and the maize rbcS promoter, Schaffner and
Sheen (1991) Plant Cell 3: 997); wounding (e.g., wun1, Siebertz et al. (1989) Plant Cell 1: 961);
pathogens (such as the PR-1 promoter described in Buchel et al. (1999) Plant Mol. Biol. 40:387-396,
and the PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol. 38:1071-80),
and chemicals such as methyl jasmonate or salicylic acid (Gatz et al. (1997) Plant Mol Biol 48: 89-108).
In addition, the timing of the expression can be controlled by using promoters such as those
acting at senescence (Gan and Amasino (1995) Science 270: 1986-1988); or late seed development
(Odell et al. (1994) Plant Physiol 106:447-458).

Plant expression vectors can also include RNA processing signals that can be
positioned within, upstream or downstream of the coding sequence. In addition, the expression
vectors can include additional regulatory sequences from the 3'-untranslated region of plant
genes, e.g., a 3' terminator region to increase stability of the mRNA, such as the PI-II
terminator region of potato or the octopine or nopaline synthase 3' terminator regions.

Additional Expression Elements 

Specific initiation signals can aid in efficient translation of coding sequences.
These signals can include, e.g., the ATG initiation codon and adjacent sequences. In cases where
a coding sequence, its initiation codon and upstream sequences are inserted into the appropriate
expression vector, no additional translational control signals may be needed. However, in cases
where only coding sequence (e.g., a mature protein coding sequence), or a portion thereof, is
inserted, exogenous translational control signals including the ATG initiation codon can be
separately provided. The initiation codon is provided in the correct reading frame to facilitate
translation. Exogenous translational elements and initiation codons can be of various origins,
both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of
enhancers appropriate to the cell system in use.

Expression Hosts 

The present invention also relates to host cells which are transduced with vectors
of the invention, and the production of polypeptides of the invention (including fragments
thereof) by recombinant techniques. Host cells are genetically engineered (i.e., nucleic acids are
introduced, e.g., transduced, transformed or transfected) with the vectors of this invention, which
may be, for example, a cloning vector or an expression vector comprising the relevant nucleic
acids herein. The vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid,
etc. The engineered host cells can be cultured in conventional nutrient media modified as
appropriate for activating promoters, selecting transformants, or amplifying the relevant gene.
The culture conditions, such as temperature, pH and the like, are those previously used with the
host cell selected for expression, and will be apparent to those skilled in the art and in the
references cited herein, including Sambrook and Ausubel.

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or the
host cell can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are also suitable for
some applications. For example, the DNA fragments are introduced into plant tissues, cultured
plant cells or plant protoplasts by standard methods including electroporation (Fromm et al.,
(1985) Proc. Natl. Acad. Sci. USA 82, 5824), infection by viral vectors such as cauliflower mosaic
virus (CaMV) (Hohn et al., (1982) Molecular Biology of Plant Tumors, (Academic Press, New
York) pp. 549-560; US 4,407,956), high velocity ballistic penetration by small particles with the
nucleic acid either within the matrix of small beads or particles, or on the surface (Klein et al.,
(1987) Nature 327, 70-73), use of pollen as vector (WO 85/01856), or use of Agrobacterium
tumefaciens or A. rhizogenes carrying a T-DNA plasmid in which DNA fragments are cloned.
The T-DNA plasmid is transmitted to plant cells upon infection by Agrobacterium tumefaciens,
and a portion is stably integrated into the plant genome (Horsch et al. (1984) Science 233:496-498;
Fraley et al. (1983) Proc. Natl. Acad. Sci. USA 80, 4803).

The cell can include a nucleic acid of the invention which encodes a polypeptide,
wherein the cell expresses a polypeptide of the invention. The cell can also include vector
sequences, or the like. Furthermore, cells and transgenic plants which include any polypeptide or
nucleic acid above or throughout this specification, e.g., produced by transduction of a vector of
the invention, are an additional feature of the invention.

For long-term, high-yield production of recombinant proteins, stable expression
can be used. Host cells transformed with a nucleotide sequence encoding a polypeptide of the
invention are optionally cultured under conditions suitable for the expression and recovery of the
encoded protein from cell culture. The protein or fragment thereof produced by a recombinant
cell may be secreted, membrane-bound, or contained intracellularly, depending on the sequence
and/or the vector used. As will be understood by those of skill in the art, expression vectors
containing polynucleotides encoding mature proteins of the invention can be designed with signal
sequences which direct secretion of the mature polypeptides through a prokaryotic or eukaryotic
cell membrane.

Modified Amino Acids 

Polypeptides of the invention may contain one or more modified amino acids.
The presence of modified amino acids may be advantageous in, for example, increasing
polypeptide half-life, reducing polypeptide antigenicity or toxicity, increasing polypeptide storage
stability, or the like. Amino acid(s) are modified, for example, co-translationally or
post-translationally during recombinant production or modified by synthetic or chemical means.

Non-limiting examples of a modified amino acid include incorporation or other
use of acetylated amino acids, glycosylated amino acids, sulfated amino acids, prenylated (e.g., 
farnesylated, geranylgeranylated) amino acids, PEG modified (e.g., "PEGylated") amino acids, 
biotinylated amino acids, carboxylated amino acids, phosphorylated amino acids, etc. References 
adequate to guide one of skill in the modification of amino acids are replete throughout the 
literature. 

IDENTIFICATION OF ADDITIONAL FACTORS 

A transcription factor provided by the present invention can also be used to
identify additional endogenous or exogenous molecules that can affect a phenotype or trait of
interest. On the one hand, such molecules include organic (small or large molecules) and/or
inorganic compounds that affect expression of (i.e., regulate) a particular transcription factor.
Alternatively, such molecules include endogenous molecules that are acted upon at a
transcriptional level by a transcription factor of the invention to modify a phenotype as desired.
For example, the transcription factors can be employed to identify one or more downstream genes
which are subject to a regulatory effect of the transcription factor. In one approach, a
transcription factor or transcription factor homologue of the invention is expressed in a host cell,
e.g., a transgenic plant cell, tissue or explant, and expression products, either RNA or protein, of
likely or random targets are monitored, e.g., by hybridization to a microarray of nucleic acid
probes corresponding to genes expressed in a tissue or cell type of interest, by two-dimensional
gel electrophoresis of protein products, or by any other method known in the art for assessing
expression of gene products at the level of RNA or protein. Alternatively, a transcription factor
of the invention can be used to identify promoter sequences (i.e., binding sites) involved in the
regulation of a downstream target. After identifying a promoter sequence, interactions between
the transcription factor and the promoter sequence can be modified by changing specific
nucleotides in the promoter sequence or specific amino acids in the transcription factor that
interact with the promoter sequence to alter a plant trait. Typically, transcription factor DNA
binding sites are identified by gel shift assays. After identifying the promoter regions, the
promoter region sequences can be employed in double-stranded DNA arrays to identify
molecules that affect the interactions of the transcription factors with their promoters (Bulyk et al.
(1999) Nature Biotechnology 17:573-577).

The identified transcription factors are also useful to identify proteins that modify
the activity of the transcription factor. Such modification can occur by covalent modification,
such as by phosphorylation, or by protein-protein (homo- or heteropolymer) interactions. Any
method suitable for detecting protein-protein interactions can be employed. Among the methods
that can be employed are co-immunoprecipitation, cross-linking and co-purification through
gradients or chromatographic columns, and the two-hybrid yeast system.

The two-hybrid system detects protein interactions in vivo and is described in 

5 Chien, et al., (1 991), P™ Natl. Acad. Sci. USA 88, 9578-9582 and is commercially available 
from Clontech (Palo Alto, Calif.). In such a system, plasmids are constructed that encode two 
hybrid proteins: one consists of the DNA-binding domain of a transcription activator protein 
fused to the TF polypeptide and the other consists of the transcription activator protein's 
activation domain fused to an unknown protein that is encoded by a cDNA that has been 

10 recombined into the plasmid as part of a cDNA library. The DNA-binding domain fusion plasmid 
and the cDNA library are transformed into a strain of the yeast Saccharomyces cerevisiae that 
contains a reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's 
binding site. Either hybrid protein alone cannot activate transcription of the reporter gene. 
Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in 

1 5 expression of the reporter gene, which is detected by an assay for the reporter gene product. Then, 
the library plasmids responsible for reporter gene expression are isolated and sequenced to 
identify the proteins encoded by the library plasmids. After identifying proteins that interact with 
the transcription factors, assays for compounds that interfere with the TF protein-protein 
interactions can be performed.

IDENTIFICATION OF MODULATORS

In addition to the intracellular molecules described above, extracellular
molecules that alter activity or expression of a transcription factor, either directly or indirectly, 
can be identified. For example, the methods can entail first placing a candidate molecule in 
contact with a plant or plant cell. The molecule can be introduced by topical administration, such
as spraying or soaking of a plant, and then the molecule's effect on the expression or activity of
the TF polypeptide or the expression of the polynucleotide is monitored. Changes in the expression
of the TF polypeptide can be monitored by use of polyclonal or monoclonal antibodies, gel 
electrophoresis or the like. Changes in the expression of the corresponding polynucleotide 
sequence can be detected by use of microarrays, Northerns, quantitative PCR, or any other
technique for monitoring changes in mRNA expression. These techniques are exemplified in
Ausubel et al. (eds) Current Protocols in Molecular Biology. John Wiley & Sons (1998). Such
changes in the expression levels can be correlated with modified plant traits and thus identified
molecules can be useful for soaking or spraying on fruit, vegetable and grain crops to modify
traits in plants.

WO 01/36598 PCT/US00/31458

Essentially any available composition can be tested for modulatory activity of 
expression or activity of any nucleic acid or polypeptide herein. Thus, available libraries of
compounds such as chemicals, polypeptides, nucleic acids and the like can be tested for
modulatory activity. Often, potential modulator compounds can be dissolved in aqueous or 
organic (e.g., DMSO-based) solutions for easy delivery to the cell or plant of interest in which the 
activity of the modulator is to be tested. Optionally, the assays are designed to screen large 
modulator composition libraries by automating the assay steps and providing compounds from
any convenient source to assays, which are typically run in parallel (e.g., in microtiter formats on
microtiter plates in robotic assays). 

In one embodiment, high throughput screening methods involve providing a 
combinatorial library containing a large number of potential compounds (potential modulator 
compounds). Such "combinatorial chemical libraries" are then screened in one or more assays, as
described herein, to identify those library members (particular chemical species or subclasses)
that display a desired characteristic activity. The compounds thus identified can serve as target 
compounds. 

A combinatorial chemical library can be, e.g., a collection of diverse chemical 
compounds generated by chemical synthesis or biological synthesis. For example, a
combinatorial chemical library such as a polypeptide library is formed by combining a set of
chemical building blocks (e.g., in one example, amino acids) in every possible way for a given 
compound length (i.e., the number of amino acids in a polypeptide compound of a set length). 
Exemplary libraries include peptide libraries, nucleic acid libraries, antibody libraries (see, e.g., 
Vaughn et al. (1996) Nature Biotechnology 14(3):309-314 and PCT/US96/10287), carbohydrate
libraries (see, e.g., Liang et al. Science (1996) 274:1520-1522 and U.S. Patent 5,593,853),
peptide nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), and small organic molecule 
libraries (see, e.g., benzodiazepines, Baum C&EN Jan 18, page 33 (1993); isoprenoids, U.S. 
Patent 5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, U.S. 
Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 5,506,337) and the like. 
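
By way of illustration only, the exhaustive combination of building blocks for a given compound length, as described above for a polypeptide library, can be sketched in Python as follows. The function names enumerate_library and library_size are hypothetical and are not part of the disclosure; the sketch merely shows how quickly library size grows with compound length.

```python
from itertools import product

# One-letter codes for the 20 standard amino acids (the "building blocks").
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_library(building_blocks, length):
    """Yield every ordered combination of building blocks of the given length."""
    for combo in product(building_blocks, repeat=length):
        yield "".join(combo)


def library_size(building_blocks, length):
    """Number of library members: (number of building blocks) ** length."""
    return len(building_blocks) ** length


# All dipeptides that can be formed from the 20 standard amino acids.
dipeptides = list(enumerate_library(AMINO_ACIDS, 2))
print(len(dipeptides))  # 400
```

Because the size of such a library is the number of building blocks raised to the compound length, full enumeration is practical only for short compounds; longer libraries are typically sampled rather than listed.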

Preparation and screening of combinatorial or other libraries is well known to
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent 5,010,175, Furka, Int. J. Pept. Prot. Res. 37:487-493
(1991) and Houghton et al. Nature 354:84-88 (1991)). Other chemistries for generating chemical 
diversity libraries can also be used. 

In addition, as noted, compound screening equipment for high-throughput
screening is generally available, e.g., using any of a number of well known robotic systems that 
have also been developed for solution phase chemistries useful in assay systems. These systems 
include automated workstations including an automated synthesis apparatus and robotic systems
utilizing robotic arms. Any of the above devices are suitable for use with the present invention,
e.g., for high-throughput screening of potential modulators. The nature and implementation of 
modifications to these devices (if any) so that they can operate as discussed herein will be 
apparent to persons skilled in the relevant art. 

Indeed, entire high throughput screening systems are commercially available.
These systems typically automate entire procedures including all sample and reagent pipetting,
liquid dispensing, timed incubations, and final readings of the microplate in detector(s)
appropriate for the assay. These configurable systems provide high throughput and rapid start up 
as well as a high degree of flexibility and customization. Similarly, microfluidic implementations 
of screening are also commercially available. 

The manufacturers of such systems provide detailed protocols for the various high
throughput systems. Thus, for example, Zymark Corp. provides technical bulletins describing screening
systems for detecting the modulation of gene transcription, ligand binding, and the like. The 
integrated systems herein, in addition to providing for sequence alignment and, optionally, 
synthesis of relevant nucleic acids, can include such screening apparatus to identify modulators
that have an effect on one or more polynucleotides or polypeptides according to the present
invention. 

In some assays it is desirable to have positive controls to ensure that the 
components of the assays are working properly. At least two types of positive controls are 
appropriate. That is, known transcriptional activators or inhibitors can be incubated with
cells, plants, etc. in one sample of the assay, and the resulting increase or decrease in transcription
can be detected by measuring the resulting change in RNA or protein expression, etc., according to
the methods herein. It will be appreciated that modulators can also be combined with 
transcriptional activators or inhibitors to find modulators which inhibit transcriptional activation 
or transcriptional repression. Either expression of the nucleic acids and proteins herein or any
additional nucleic acids or proteins activated by the nucleic acids or proteins herein, or both, can
be monitored. 

In an embodiment, the invention provides a method for identifying compositions 
that modulate the activity or expression of a polynucleotide or polypeptide of the invention. For 
example, a test compound, whether a small or large molecule, is placed in contact with a cell, 
plant (or plant tissue or explant), or composition comprising the polynucleotide or polypeptide of
interest and a resulting effect on the cell, plant, (or tissue or explant) or composition is evaluated
by monitoring, either directly or indirectly, one or more of: expression level of the polynucleotide
or polypeptide, activity (or modulation of the activity) of the polynucleotide or polypeptide. In
some cases, an alteration in a plant phenotype can be detected following contact of a plant (or
plant cell, or tissue or explant) with the putative modulator, e.g., by modulation of expression or 
activity of a polynucleotide or polypeptide of the invention. 



SUBSEQUENCES

Also contemplated are uses of polynucleotides, also referred to herein as
oligonucleotides, typically having at least 12 bases, preferably at least 15, more preferably at least
20, 30, or 50 bases, which hybridize under at least highly stringent (or ultra-high stringent or
ultra-ultra-high stringent) conditions to a polynucleotide sequence described above.
The polynucleotides may be used as probes, primers, sense and antisense agents, and the like,
according to methods as noted supra.

Subsequences of the polynucleotides of the invention, including polynucleotide 
fragments and oligonucleotides are useful as nucleic acid probes and primers. An oligonucleotide 
suitable for use as a probe or primer is at least about 15 nucleotides in length, more often at least 
about 18 nucleotides, often at least about 21 nucleotides, frequently at least about 30 nucleotides,
or about 40 nucleotides, or more in length. A nucleic acid probe is useful in hybridization
protocols, e.g., to identify additional polypeptide homologues of the invention, including 
protocols for microarray experiments. Primers can be annealed to a complementary target DNA 
strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA 
strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer
pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain
reaction (PCR) or other nucleic-acid amplification methods. See Sambrook and Ausubel, supra.

In addition, the invention includes an isolated or recombinant polypeptide 
including a subsequence of at least about 15 contiguous amino acids encoded by the recombinant 
or isolated polynucleotides of the invention. For example, such polypeptides, or domains or
fragments thereof, can be used as immunogens, e.g., to produce antibodies specific for the
polypeptide sequence, or as probes for detecting a sequence of interest. A subsequence can range
in size from about 15 amino acids in length up to and including the full length of the polypeptide. 

PRODUCTION OF TRANSGENIC PLANTS
Modification of Traits 

The polynucleotides of the invention are favorably employed to produce
transgenic plants with various traits, or characteristics, that have been modified in a desirable
manner, e.g., to improve the environmental stress resistance of a plant. For example, alteration of
expression levels or patterns (e.g., spatial or temporal expression patterns) of one or more of the 
transcription factors (or transcription factor homologues) of the invention, as compared with the 
levels of the same protein found in a wild type plant, can be used to modify a plant's traits. An 
illustrative example of trait modification, improved environmental stress tolerance, by altering
expression levels of a particular transcription factor is described further in the Examples and the
Sequence Listing. 

Antisense and Cosuppression Approaches

In addition to expression of the nucleic acids of the invention as gene 
replacement or plant phenotype modification nucleic acids, the nucleic acids are also useful for
sense and anti-sense suppression of expression, e.g., to down-regulate expression of a nucleic
acid of the invention, e.g., as a further mechanism for modulating plant phenotype. That is, the 
nucleic acids of the invention, or subsequences or anti-sense sequences thereof, can be used to 
block expression of naturally occurring homologous nucleic acids. A variety of sense and
anti-sense technologies are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997)
Antisense Technology: A Practical Approach, IRL Press at Oxford University, Oxford, England.
In general, sense or anti-sense sequences are introduced into a cell, where they are optionally 
amplified, e.g., by transcription. Such sequences include both simple oligonucleotide sequences 
and catalytic sequences such as ribozymes. 

For example, a reduction or elimination of expression (i.e., a "knock-out") of a
transcription factor or transcription factor homologue polypeptide in a transgenic plant, e.g., to
modify a plant trait, can be obtained by introducing an antisense construct corresponding to the 
polypeptide of interest as a cDNA. For antisense suppression, the transcription factor or homologue 
cDNA is arranged in reverse orientation (with respect to the coding sequence) relative to the 
promoter sequence in the expression vector. The introduced sequence need not be the full length
cDNA or gene, and need not be identical to the cDNA or gene found in the plant type to be
transformed. Typically, the antisense sequence need only be capable of hybridizing to the target
gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a higher
degree of homology to the endogenous transcription factor sequence will be needed for effective 
antisense suppression. While antisense sequences of various lengths can be utilized, preferably, 
the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and
improved antisense suppression will typically be observed as the length of the antisense sequence
increases. Preferably, the length of the antisense sequence in the vector will be greater than 100
nucleotides. Transcription of an antisense construct as described results in the production of
RNA molecules that are the reverse complement of mRNA molecules transcribed from the
endogenous transcription factor gene in the plant cell. 
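
As a minimal sketch of the relationship just described, the following Python fragment derives the antisense RNA that is the reverse complement of a given mRNA and checks it against the preferred minimum lengths of 30 and 100 nucleotides stated above. The function names and the example sequence are hypothetical and are provided for illustration only; they do not correspond to any sequence of the Sequence Listing.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna):
    """Return the reverse complement of an mRNA, both written 5' to 3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


def meets_length_preference(sequence, minimum=30):
    """True if the sequence is at least `minimum` nucleotides long."""
    return len(sequence) >= minimum


# Hypothetical mRNA fragment used only to exercise the functions.
example_mrna = "AUGGCC"
print(antisense_rna(example_mrna))  # GGCCAU
print(meets_length_preference(antisense_rna(example_mrna)))
print(meets_length_preference("A" * 120, minimum=100))
```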

Suppression of endogenous transcription factor gene expression can also be 
achieved using a ribozyme. Ribozymes are RNA molecules that possess highly specific 
endoribonuclease activity. The production and use of ribozymes are disclosed in U.S. Patent No.
4,987,071 and U.S. Patent No. 5,543,508. Synthetic ribozyme sequences including antisense
RNAs can be used to confer RNA cleaving activity on the antisense RNA, such that endogenous 
mRNA molecules that hybridize to the antisense RNA are cleaved, which in turn leads to an 
enhanced antisense inhibition of endogenous gene expression. 

Vectors in which RNA encoded by a transcription factor or transcription factor
homologue cDNA is over-expressed can also be used to obtain co-suppression of a corresponding
endogenous gene, e.g., in the manner described in U.S. Patent No. 5,231,020 to Jorgensen. Such
co-suppression (also termed sense suppression) does not require that the entire transcription factor 
cDNA be introduced into the plant cells, nor does it require that the introduced sequence be 
exactly identical to the endogenous transcription factor gene of interest. However, as with
antisense suppression, the suppressive efficiency will be enhanced as specificity of hybridization
is increased, e.g., as the introduced sequence is lengthened, and/or as the sequence similarity 
between the introduced sequence and the endogenous transcription factor gene is increased. 

Vectors expressing an untranslatable form of the transcription factor mRNA (e.g.,
sequences comprising one or more stop codons or nonsense mutations) can also be used to
suppress expression of an endogenous transcription factor, thereby reducing or eliminating its
activity and modifying one or more traits. Methods for producing such constructs are described
in U.S. Patent No. 5,583,021. Preferably, such constructs are made by introducing a premature
stop codon into the transcription factor gene. Alternatively, a plant trait can be modified by gene 
silencing using double-stranded RNA (Sharp (1999) Genes and Development 13:139-141).

Another method for abolishing the expression of a gene is by insertion
mutagenesis using the T-DNA of Agrobacterium tumefaciens. After generating the insertion
mutants, the mutants can be screened to identify those containing the insertion in a transcription 
factor or transcription factor homologue gene. Plants containing a single transgene insertion 
event at the desired gene can be crossed to generate homozygous plants for the mutation (Koncz
et al. (1992) Methods in Arabidopsis Research, World Scientific). 

Alternatively, a plant phenotype can be altered by eliminating an endogenous 
gene, such as a transcription factor or transcription factor homologue, e.g., by homologous
recombination (Kempin et al. (1997) Nature 389:802).

A plant trait can also be modified by using the cre-lox system (for example, as 
described in US Pat. No. 5,658,772). A plant genome can be modified to include first and 
second lox sites that are then contacted with a Cre recombinase. If the lox sites are in the same 
orientation, the intervening DNA sequence between the two sites is excised. If the lox sites are in
the opposite orientation, the intervening sequence is inverted.
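
The two outcomes of Cre-mediated recombination described above can be illustrated with the following simplified Python model. It is a sketch under stated assumptions: the locus is represented as three DNA strings separated by two lox sites whose orientations are given as '+' or '-', a single lox site is assumed to remain after excision, and the function name cre_recombine is hypothetical. The model does not capture the sequence of the lox sites themselves or the kinetics of the reaction.

```python
def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def cre_recombine(left, lox1, middle, lox2, right):
    """Model Cre action on a locus: left - lox1 - middle - lox2 - right.

    lox1 and lox2 are orientations, '+' or '-'.
    Same orientation: the intervening DNA is excised (one lox site is kept).
    Opposite orientation: the intervening DNA is inverted in place.
    """
    if lox1 == lox2:
        return (left, lox1, right)
    return (left, lox1, reverse_complement(middle), lox2, right)


# Hypothetical flanking and intervening sequences.
print(cre_recombine("AAA", "+", "ACCG", "+", "TTT"))  # excision
print(cre_recombine("AAA", "+", "ACCG", "-", "TTT"))  # inversion
```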

The polynucleotides and polypeptides of this invention can also be expressed in a 
plant in the absence of an expression cassette by manipulating the activity or expression level of 
the endogenous gene by other means, for example, by ectopically expressing a gene by T-DNA
activation tagging (Ichikawa et al. (1997) Nature 390:698-701; Kakimoto et al. (1996) Science
274:982-985). This method entails transforming a plant with a gene tag containing multiple
transcriptional enhancers; once the tag has inserted into the genome, expression of a flanking
gene coding sequence becomes deregulated. In another example, the transcriptional machinery in
a plant can be modified so as to increase transcription levels of a polynucleotide of the invention
(see, e.g., PCT Publications WO 96/06166 and WO 98/53057, which describe the modification of
the DNA binding specificity of zinc finger proteins by changing particular amino acids in the
DNA binding motif).

The transgenic plant can also include the machinery necessary for expressing or 
altering the activity of a polypeptide encoded by an endogenous gene, for example by altering the 
phosphorylation state of the polypeptide to maintain it in an activated state. 

Transgenic plants (or plant cells, or plant explants, or plant tissues) incorporating
the polynucleotides of the invention and/or expressing the polypeptides of the invention can be
produced by a variety of well established techniques as described above. Following construction 
of a vector, most typically an expression cassette, including a polynucleotide, e.g., encoding a 
transcription factor or transcription factor homologue, of the invention, standard techniques can
be used to introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue
of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce a transgenic 
plant. 

The plant can be any higher plant, including gymnosperms, monocotyledonous 
and dicotyledonous plants. Suitable protocols are available for Leguminosae (alfalfa, soybean,
clover, etc.), Umbelliferae (carrot, celery, parsnip), Cruciferae (cabbage, radish, rapeseed,
broccoli, etc.), Cucurbitaceae (melons and cucumber), Gramineae (wheat, corn, rice, barley,
millet, etc.), Solanaceae (potato, tomato, tobacco, peppers, etc.), and various other crops. See
protocols described in Ammirato et al. (1984) Handbook of Plant Cell Culture - Crop Species,
Macmillan Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990)
Bio/Technology 8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434.

Transformation and regeneration of both monocotyledonous and dicotyledonous 
plant cells is now routine, and the selection of the most appropriate transformation technique will 
be determined by the practitioner. The choice of method will vary with the type of plant to be 
transformed; those skilled in the art will recognize the suitability of particular methods for given
plant types. Suitable methods can include, but are not limited to: electroporation of plant 
protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated 
transformation; transformation using viruses; micro-injection of plant cells; micro-projectile 
bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens mediated
transformation. Transformation means introducing a nucleotide sequence in a plant in a manner to
cause stable or transient expression of the sequence. 

Successful examples of the modification of plant characteristics by 
transformation with cloned sequences which serve to illustrate the current knowledge in this field 
of technology, and which are herein incorporated by reference, include: U.S. Patent Nos. 
5,571,706; 5,677,175; 5,510,471; 5,750,386; 5,597,945; 5,589,615; 5,750,871; 5,268,526;
5,780,708; 5,538,880; 5,773,269; 5,736,369 and 5,610,042. 

Following transformation, plants are preferably selected using a dominant 
selectable marker incorporated into the transformation vector. Typically, such a marker will 
confer antibiotic or herbicide resistance on the transformed plants, and selection of transformants 
can be accomplished by exposing the plants to appropriate concentrations of the antibiotic or
herbicide. 

After transformed plants are selected and grown to maturity, those plants 
showing a modified trait are identified. The modified trait can be any of those traits described
above. Additionally, to confirm that the modified trait is due to changes in expression levels or
activity of the polypeptide or polynucleotide of the invention, such changes can be determined by analyzing
mRNA expression using Northern blots, RT-PCR or microarrays, or protein expression using
immunoblots or Western blots or gel shift assays.

INTEGRATED SYSTEMS - SEQUENCE IDENTITY

Additionally, the present invention may be an integrated system, computer or
computer readable medium that comprises an instruction set for determining the identity of one or
more sequences in a database. In addition, the instruction set can be used to generate or identify
sequences that meet any specified criteria. Furthermore, the instruction set may be used to
associate or link certain functional benefits, such as improved environmental stress tolerance, with
one or more identified sequences.

For example, the instruction set can include, e.g., a sequence comparison or other
alignment program, e.g., an available program such as, for example, the Wisconsin Package
Version 10.0, such as BLAST, FASTA, PILEUP, FINDPATTERNS or the like (GCG, Madison,
WI). Public sequence databases such as GenBank, EMBL, Swiss-Prot and PIR or private
sequence databases such as PhytoSeq (Incyte Pharmaceuticals, Palo Alto, CA) can be searched.

Alignment of sequences for comparison can be conducted by the local homology 
algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment
algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity
method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA 85:2444, or by computerized
implementations of these algorithms. After alignment, sequence comparisons between two (or
more) polynucleotides or polypeptides are typically performed by comparing sequences of the
two sequences over a comparison window to identify and compare local regions of sequence
similarity. The comparison window can be a segment of at least about 20 contiguous positions,
usually about 50 to about 200, more usually about 100 to about 150 contiguous positions. A 
description of the method is provided in Ausubel et al., supra. 
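
For illustration, a local homology alignment of the Smith and Waterman type can be sketched in Python as follows. This is a minimal sketch that returns only the best local alignment score; the scoring values (match 2, mismatch -1, linear gap penalty -2) and the function name smith_waterman_score are assumptions chosen for the example and are not parameters taken from the cited reference or from any particular program.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses the standard dynamic-programming recurrence with a linear gap
    penalty; a cell is reset to 0 whenever every option would be negative,
    which is what makes the alignment local rather than global.
    """
    rows, cols = len(a) + 1, len(b) + 1
    scores = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = scores[i - 1][j - 1] + (
                match if a[i - 1] == b[j - 1] else mismatch
            )
            scores[i][j] = max(
                0,
                diagonal,               # align a[i-1] with b[j-1]
                scores[i - 1][j] + gap,  # gap in b
                scores[i][j - 1] + gap,  # gap in a
            )
            best = max(best, scores[i][j])
    return best


# Two hypothetical sequences sharing the local region "ACGT".
print(smith_waterman_score("TTACGTTT", "GGACGTGG"))  # 8
```

The shared region contributes four matches at 2 points each, and extending the alignment in either direction would only add mismatches, so the best local score is 8 under these assumed parameters.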

A variety of methods of determining sequence relationships can be used, 
including manual alignment and computer assisted sequence alignment and analysis. This latter
approach is a preferred approach in the present invention, due to the increased throughput
afforded by computer assisted methods. As noted above, a variety of computer programs for 
performing sequence alignment are available, or can be produced by one of skill. 

One example algorithm that is suitable for determining percent sequence identity 
and sequence similarity is the BLAST algorithm, which is described in Altschul et al. J. Mol. Biol.
215:403-410 (1990). Software for performing BLAST analyses is publicly available, e.g.,
through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short 
words of length W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database sequence. T is 
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. 
The word hits are then extended in both directions along each sequence for as far as the 
cumulative alignment score can be increased. Cumulative scores are calculated using, for 
nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >
0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a 
scoring matrix is used to calculate the cumulative score. Extension of the word hits in each
direction is halted when: the cumulative alignment score falls off by the quantity X from its
maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of
one or more negative-scoring residue alignments; or the end of either sequence is reached. The
BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. 
The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an
expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino
acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E)
of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989) Proc. Natl. Acad.
Sci. USA 89:10915).
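
The word-hit extension and the three halting conditions described above can be sketched, for nucleotide sequences and in one direction only, as the following Python fragment. It is a simplified, ungapped illustration and not the BLAST implementation itself: the seed is assumed to be an exact match of the stated word length, the drop-off value X=20 and the short word length used in the example are arbitrary assumptions, and the function name extend_right is hypothetical. The reward M=5 and penalty N=-4 are the BLASTN defaults given above.

```python
def extend_right(query, subject, q_start, s_start, word_len, M=5, N=-4, X=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the seed. Extension halts when the score
    falls X or more below its maximum, when the score reaches zero or
    below, or when the end of either sequence is reached.
    """
    score = word_len * M  # assumed: every seed position is a match
    best_score, best_length = score, word_len
    offset = word_len
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += M if same else N
        offset += 1
        if score > best_score:
            best_score, best_length = score, offset
        elif best_score - score >= X or score <= 0:
            break
    return best_score, best_length


# Hypothetical sequences: eight matching positions followed by mismatches.
print(extend_right("ACGTACGTAAAA", "ACGTACGTCCCC", 0, 0, word_len=4))  # (40, 8)
```

In the example the score rises to 40 over the eight matching positions and then declines by 4 per mismatch; the extension stops at the end of the sequences, and the reported segment is the one ending at the maximum score.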

In addition to calculating percent sequence identity, the BLAST algorithm also 
performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & 
Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5787). One measure of similarity provided
by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of
the probability by which a match between two nucleotide or amino acid sequences would occur 
by chance. For example, a nucleic acid is considered similar to a reference sequence (and, 
therefore, in this context, homologous) if the smallest sum probability in a comparison of the test 
nucleic acid to the reference nucleic acid is less than about 0.1, or less than about 0.01, or
even less than about 0.001. An additional example of a useful sequence alignment algorithm is
PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using
progressive, pairwise alignments. The program can align, e.g., up to 300 sequences of a 
maximum length of 5,000 letters. 

The integrated system or computer typically includes a user input interface
allowing a user to selectively view one or more sequence records corresponding to the one or
more character strings, as well as an instruction set which aligns the one or more character strings
with each other or with an additional character string to identify one or more region of sequence 
similarity. The system may include a link of one or more character strings with a particular 
phenotype or gene function. Typically, the system includes a user readable output element which
displays an alignment produced by the alignment instruction set. 

The methods of this invention can be implemented in a localized or distributed 
computing environment. In a distributed environment, the methods may be implemented on a single
computer comprising multiple processors or on a multiplicity of computers. The computers can 
be linked, e.g. through a common bus, but more preferably the computer(s) are nodes on a 
network. The network can be a generalized or a dedicated local or wide-area network and, in 
certain preferred embodiments, the computers may be components of an intra-net or an internet. 

Thus, the invention provides methods for identifying a sequence similar or 
homologous to one or more polynucleotides as noted herein, or one or more target polypeptides 
encoded by the polynucleotides, or otherwise noted herein and may include linking or associating 
a given plant phenotype or gene function with a sequence. In the methods, a sequence database is 
provided (locally or across an internet or intranet) and a query is made against the sequence
database using the relevant sequences herein and associated plant phenotypes or gene functions. 

Any sequence herein can be entered into the database, before or after querying 
the database. This provides for both expansion of the database and, if done before the querying 
step, for insertion of control sequences into the database. The control sequences can be detected 
by the query to ensure the general integrity of both the database and the query. As noted, the 
query can be performed using a web browser based interface. For example, the database can be a 
centralized public database such as those noted herein, and the querying can be done from a 
remote terminal or computer across an internet or intranet. 

EXAMPLES 

The following examples are intended to illustrate but not limit the present
invention.

EXAMPLE I. FULL LENGTH GENE IDENTIFICATION AND CLONING 

Putative transcription factor sequences (genomic or ESTs) related to known 
transcription factors were identified in the Arabidopsis thaliana GenBank database using the 
tblastn sequence analysis program using default parameters and a P-value cutoff threshold of -4 
or -5 or lower, depending on the length of the query sequence. Putative transcription factor 
sequence hits were then screened to identify those containing particular sequence strings. If the 
sequence hits contained such sequence strings, the sequences were confirmed as transcription 
factors. 

Alternatively, Arabidopsis thaliana cDNA libraries derived from different tissues
or treatments, or genomic libraries were screened to identify novel members of a transcription
family using a low stringency hybridization approach. Probes were synthesized using gene
specific primers in a standard PCR reaction (annealing temperature 60° C) and labeled with 32P
dCTP using the High Prime DNA Labeling Kit (Boehringer Mannheim). Purified radiolabelled
probes were added to filters immersed in Church hybridization medium (0.5 M NaPO4 pH 7.0,
7% SDS, 1% w/v bovine serum albumin) and hybridized overnight at 60° C with shaking. Filters
were washed two times for 45 to 60 minutes with 1x SSC, 1% SDS at 60° C.

To identify additional sequence 5' or 3' of a partial cDNA sequence in a cDNA 
library, 5' and 3' rapid amplification of cDNA ends (RACE) was performed using the Marathon™
cDNA amplification kit (Clontech, Palo Alto, CA). Generally, the method entailed first isolating 
poly(A) mRNA, performing first and second strand cDNA synthesis to generate double stranded 
cDNA, blunting cDNA ends, followed by ligation of the Marathon™ Adaptor to the cDNA to 
form a library of adaptor-ligated ds cDNA. 
1 5 Gene-specific primers were designed to be used along with adaptor specific 

primers for both 5 1 and 3 f RACE reactions. Nested primers, rather than single primers, were used 
to increase PCR specificity. Using 5' and 3' RACE reactions, 5* and 3' RACE fragments were 
obtained, sequenced and cloned. The process can be repeated until 5' and 3' ends of the full- 
length gene were identified. Then the full-length cDNA was generated by PCR using primers 
20 specific to 5' and 3* ends of the gene by end-to-end PCR. 

EXAMPLE II. CONSTRUCTION OF EXPRESSION VECTORS 

The sequence was amplified from a genomic or cDNA library using primers 
specific to sequences upstream and downstream of the coding region. The expression vector was 
pMEN20 or pMEN65, which are both derived from pMON316 (Sanders et al. (1987) Nucleic 
Acids Research 15:1543-58) and contain the CaMV 35S promoter to express transgenes. To 
clone the sequence into the vector, both pMEN20 and the amplified DNA fragment were digested 
separately with SalI and NotI restriction enzymes at 37° C for 2 hours. The digestion products 
were subjected to electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide 
staining. The DNA fragments containing the sequence and the linearized plasmid were excised 
and purified by using a Qiaquick gel extraction kit (Qiagen, CA). The fragments of interest were 
ligated at a ratio of 3:1 (vector to insert). Ligation reactions using T4 DNA ligase (New England 
Biolabs, MA) were carried out at 16° C for 16 hours. The ligated DNAs were transformed into 
competent cells of the E. coli strain DH5alpha by using the heat shock method. The 
transformations were plated on LB plates containing 50 mg/l kanamycin (Sigma). 
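
As a worked illustration of the 3:1 (vector to insert) ligation ratio, the sketch below converts a vector mass into the matching insert mass. It assumes the ratio is a molar ratio, and the vector size, insert size, and vector mass are hypothetical values chosen for the arithmetic; none of them are stated in the specification.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, vector_to_insert=3.0):
    """Insert mass (ng) giving the requested vector:insert molar ratio.

    Moles of a double-stranded fragment are proportional to mass / length,
    so insert_ng = vector_ng * (insert_bp / vector_bp) / vector_to_insert.
    """
    if vector_bp <= 0 or insert_bp <= 0 or vector_to_insert <= 0:
        raise ValueError("lengths and ratio must be positive")
    return vector_ng * insert_bp / vector_bp / vector_to_insert


# Hypothetical example: 90 ng of a 12,000 bp vector and a 1,000 bp insert.
mass = insert_mass_ng(vector_ng=90.0, vector_bp=12000, insert_bp=1000)
print(f"{mass:.1f} ng insert")
```

With these assumed numbers, 90 ng of vector pairs with 2.5 ng of insert; a different vector or insert length changes the result proportionally.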

Individual colonies were grown overnight in five milliliters of LB broth 
containing 50 mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini 
Prep kits (Qiagen, CA). 

EXAMPLE III. TRANSFORMATION OF AGROBACTERIUM WITH THE EXPRESSION 
VECTOR 

After the plasmid vector containing the gene was constructed, the vector was 
used to transform Agrobacterium tumefaciens cells expressing the gene products. The stock of 
Agrobacterium tumefaciens cells for transformation was made as described by Nagel et al. 
(1990) FEMS Microbiol. Letts. 67: 325-328. Agrobacterium strain ABI was grown in 250 ml LB 
medium (Sigma) overnight at 28° C with shaking until an absorbance (A600) of 0.5 - 1.0 was 
reached. Cells were harvested by centrifugation at 4,000 x g for 15 min at 4° C. Cells were then 
resuspended in 250 µl chilled buffer (1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were 
centrifuged again as described above and resuspended in 125 µl chilled buffer. Cells were then 
centrifuged and resuspended two more times in the same HEPES buffer as described above at a 
volume of 100 µl and 750 µl, respectively. Resuspended cells were then distributed into 40 µl 
aliquots, quickly frozen in liquid nitrogen, and stored at -80° C. 

Agrobacterium cells were transformed with plasmids prepared as described 
above following the protocol described by Nagel et al. For each DNA construct to be 
transformed, 50-100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 
8.0) was mixed with 40 µl of Agrobacterium cells. The DNA/cell mixture was then transferred to 
a chilled cuvette with a 2 mm electrode gap and subjected to a 2.5 kV charge dissipated at 25 µF and 
200 µF using a Gene Pulser II apparatus (Bio-Rad). After electroporation, cells were 
immediately resuspended in 1.0 ml LB and allowed to recover without antibiotic selection for 2 - 
4 hours at 28° C in a shaking incubator. After recovery, cells were plated onto selective medium 
of LB broth containing 100 µg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 28° C. 
Single colonies were then picked and inoculated in fresh medium. The presence of the plasmid 
construct was verified by PCR amplification and sequence analysis. 

EXAMPLE IV. TRANSFORMATION OF ARABIDOPSIS PLANTS WITH AGROBACTERIUM 
TUMEFACIENS WITH EXPRESSION VECTOR 

After transformation of Agrobacterium tumefaciens with plasmid vectors 
containing the gene, single Agrobacterium colonies were identified, propagated, and used to 
transform Arabidopsis plants. Briefly, 500 ml cultures of LB medium containing 50 mg/l 
kanamycin were inoculated with the colonies and grown at 28° C with shaking for 2 days until an 
absorbance (A600) of > 2.0 was reached. Cells were then harvested by centrifugation at 4,000 x g 
for 10 min, and resuspended in infiltration medium (1/2 X Murashige and Skoog salts (Sigma), 
1 X Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 µM benzylamino purine 
(Sigma), 200 µl/L Silwet L-77 (Lehle Seeds)) until an absorbance (A600) of 0.8 was reached. 

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were 
sown at a density of ~10 plants per 4" pot onto Pro-Mix BX potting medium (Hummert 
International) covered with fiberglass mesh (18 mm X 16 mm). Plants were grown under 
continuous illumination (50-75 µE/m²/sec) at 22-23° C with 65-70% relative humidity. After 
about 4 weeks, primary inflorescence stems (bolts) were cut off to encourage growth of multiple 
secondary bolts. After flowering of the mature secondary bolts, plants were prepared for 
transformation by removal of all siliques and opened flowers. 

The pots were then immersed upside down in the mixture of Agrobacterium 
infiltration medium as described above for 30 sec, and placed on their sides to allow draining into 
a 1' x 2' flat surface covered with plastic wrap. After 24 h, the plastic wrap was removed and 
pots were turned upright. The immersion procedure was repeated one week later, for a total of two 
immersions per pot. Seeds were then collected from each transformation pot and analyzed 
following the protocol described below. 

EXAMPLE V. IDENTIFICATION OF ARABIDOPSIS PRIMARY TRANSFORMANTS 

Seeds collected from the transformation pots were sterilized essentially as 
follows. Seeds were dispersed into a solution containing 0.1% (v/v) Triton X-100 (Sigma) and 
sterile H2O and washed by shaking the suspension for 20 min. The wash solution was then 
drained and replaced with fresh wash solution to wash the seeds for 20 min with shaking. After 
removal of the second wash solution, a solution containing 0.1% (v/v) Triton X-100 and 70% 
ethanol (Equistar) was added to the seeds and the suspension was shaken for 5 min. After 
removal of the ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% 
(v/v) bleach (Clorox) was added to the seeds, and the suspension was shaken for 10 min. After 
removal of the bleach/detergent solution, seeds were then washed five times in sterile distilled 
H2O. The seeds were stored in the last wash water at 4° C for 2 days in the dark before being 
plated onto antibiotic selection medium (1 X Murashige and Skoog salts (pH adjusted to 5.7 with 
1 M KOH), 1 X Gamborg's B-5 vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l 
kanamycin). Seeds were germinated under continuous illumination (50-75 µE/m²/sec) at 22-23° 
C. After 7-10 days of growth under these conditions, kanamycin resistant primary transformants 
(T1 generation) were visible and obtained. These seedlings were transferred first to fresh 
selection plates where the seedlings continued to grow for 3-5 more days, and then to soil 
(Pro-Mix BX potting medium). 

Primary transformants were crossed and progeny seeds (T2) collected; kanamycin 
resistant seedlings were selected and analyzed. The expression levels of the recombinant 
polynucleotides in the transformants vary from about a 5% expression level increase to at least a 
100% expression level increase. Similar observations are made with respect to polypeptide level 
expression. 

EXAMPLE VI. IDENTIFICATION OF ARABIDOPSIS PLANTS WITH TRANSCRIPTION 
FACTOR GENE KNOCKOUTS 

The screening of insertion mutagenized Arabidopsis collections for null mutants 
in a known target gene was essentially as described in Krysan et al. (1999) Plant Cell 
11:2283-2290. Briefly, gene-specific primers, nested by 5-250 base pairs to each other, were designed 
from the 5' and 3' regions of a known target gene. Similarly, nested sets of primers were also 
created specific to each of the T-DNA or transposon ends (the "right" and "left" borders). All 
possible combinations of gene specific and T-DNA/transposon primers were used to detect by 
PCR an insertion event within or close to the target gene. The amplified DNA fragments were 
then sequenced, which allows the precise determination of the T-DNA/transposon insertion point 
relative to the target gene. Insertion events within the coding or intervening sequence of the 
genes were deconvoluted from a pool comprising a plurality of insertion events to a single unique 
mutant plant for functional characterization. The method is described in more detail in Yu and 
Adam, US Application Serial No. 09/177,733 filed October 23, 1998. 
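
The "all possible combinations" step can be illustrated by enumerating every pairing of a gene-specific primer with an insertion-border primer. The primer names below are hypothetical placeholders, not primers from the specification, and the sketch covers only the bookkeeping, not the PCR itself.

```python
from itertools import product

# Hypothetical primer labels: outer and nested primers at each gene end,
# plus one primer for each T-DNA/transposon border.
gene_primers = ["gene_5p_outer", "gene_5p_nested", "gene_3p_outer", "gene_3p_nested"]
border_primers = ["left_border", "right_border"]


def primer_pairs(gene_specific, insertion_specific):
    """Return every (gene primer, border primer) pair to be tested by PCR."""
    return list(product(gene_specific, insertion_specific))


pairs = primer_pairs(gene_primers, border_primers)
for gene_primer, border_primer in pairs:
    print(gene_primer, "+", border_primer)
print(len(pairs), "reactions per DNA pool")
```

Four gene-specific primers and two border primers give eight reactions per pool under these assumptions.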

EXAMPLE VII. IDENTIFICATION OF ENVIRONMENTAL STRESS TOLERANCE 
PHENOTYPE IN OVEREXPRESSOR OR GENE KNOCKOUT PLANTS 

Experiments were performed to identify those transformants or knockouts that 
exhibited an improved environmental stress tolerance. For such studies, the transformants were 
exposed to a variety of environmental stresses. Plants were exposed to chilling stress (6 hour 
exposure to 4-8° C), heat stress (6 hour exposure to 32-37° C), high salt stress (6 hour exposure to 
200 mM NaCl), drought stress (168 hours after removing water from trays), osmotic stress (6 
hour exposure to 3 M mannitol), or nutrient limitation (nitrogen, phosphate, and potassium) 
(Nitrogen: all components of MS medium remained constant except N was reduced to 20 mg/L of 
NH4NO3; Phosphate: all components of MS medium except KH2PO4, which was replaced 
by K2SO4; Potassium: all components of MS medium except removal of KNO3 and KH2PO4, 
which were replaced by NaH4PO4). 

Table 3 shows the phenotypes observed for particular overexpressor or knockout 
plants and provides the SEQ ID No., the internal reference code (GID), whether a knockout or 
overexpressor plant was analyzed and the observed phenotype. 



Table 3 

SEQ ID No.   GID    Knockout (KO) or      Phenotype observed 
                    overexpressor (OE) 

1            G22    OE                    Increased tolerance to high salt 
3            G188   KO                    Better germination under osmotic stress 
5            G225   OE                    Increased tolerance to nitrogen-limited medium 
7            G226   OE                    Increased tolerance to nitrogen-limited medium 
9            G256   OE                    Better germination and growth in cold 
11           G419   OE                    Increased tolerance to potassium-free medium 
13           G464   OE                    Better germination and growth in heat 
15           G482   OE                    Increased tolerance to high salt 
17           G502   KO                    Increased sensitivity to osmotic stress 
19           G526   OE                    Increased sensitivity to osmotic stress 
21           G545   OE                    Susceptible to high salt 
23           G561   OE                    Increased tolerance to potassium-free medium 
25           G664   OE                    Better germination and growth in cold 
27           G682   OE                    Better germination and growth in heat 
29           G911   OE                    Increased growth on potassium-free medium 
31           G964   OE                    Better germination and growth in heat 
33           G394   OE                    More sensitive to chilling 
35           G489   OE                    Increased tolerance to osmotic stress 



For a particular overexpressor that shows a decreased tolerance to an environmental 
stress, it may be more useful to select a plant with a decreased expression of the particular 
transcription factor. For a particular knockout that shows a decreased tolerance to an 
environmental stress, it may be more useful to select a plant with an increased expression of the 
particular transcription factor. 

EXAMPLE VIII. IDENTIFICATION OF HOMOLOGOUS SEQUENCES 

Homologous sequences from Arabidopsis and plant species other than Arabidopsis were 
identified using database sequence search tools, such as the Basic Local Alignment Search Tool 
(BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucl. Acid 
Res. 25: 3389-3402). The tblastx sequence analysis programs were employed using the 
BLOSUM-62 scoring matrix (Henikoff, S. and Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. 
89: 10915-10919). 

Identified homologous sequences are provided in Figure 2 and included in 
the Sequence Listing. The percent sequence identity among these sequences is as low as [...] 
sequence identity. Additionally, the entire NCBI GenBank database was filtered for sequences 
from all plants except Arabidopsis thaliana by selecting all entries in the NCBI GenBank 
database associated with NCBI taxonomic ID 33090 (Viridiplantae; all plants) and excluding 
entries associated with taxonomic ID 3701 (Arabidopsis thaliana). These sequences were 
compared to sequences representing genes of SEQ ID Nos. 1-54 [...]. For each of SEQ ID 
Nos. 1-54, individual comparisons were scored by P-value, which 
reflects the probability that a particular alignment occurred by chance. For example, a score of 
3.6e-40 is 3.6 x 10^-40. For up to ten species, the gene with the lowest P-value (and therefore the 
most likely homolog) is listed in Figure 3. 
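
The selection rule described above (keep the lowest P-value hit for each species, for up to ten species) can be sketched as below. The hit tuples and identifiers are hypothetical and serve only to show the bookkeeping; the example also shows that a score written as 3.6e-40 parses directly as 3.6 x 10^-40.

```python
def best_hits_by_species(hits, max_species=10):
    """Pick the lowest P-value hit per species, then the best `max_species`.

    `hits` is an iterable of (hit_id, p_value_text, species) tuples, where
    p_value_text uses scientific notation such as "3.6e-40".
    """
    best = {}
    for hit_id, p_text, species in hits:
        p_value = float(p_text)  # "3.6e-40" -> 3.6 x 10^-40
        if species not in best or p_value < best[species][1]:
            best[species] = (hit_id, p_value)
    ranked = sorted(best.items(), key=lambda item: item[1][1])
    return [(species, hit_id, p) for species, (hit_id, p) in ranked[:max_species]]


# Hypothetical hits for one query gene.
example_hits = [
    ("hit1", "3.6e-40", "Glycine max"),
    ("hit2", "1.2e-35", "Glycine max"),
    ("hit3", "5.0e-38", "Zea mays"),
]
selected = best_hits_by_species(example_hits)
for species, hit_id, p_value in selected:
    print(species, hit_id, p_value)
```

Here the weaker Glycine max hit is dropped, and the remaining hits are listed from the lowest P-value upward.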
In addition to P-values, comparisons were also scored by percentage identity. Percentage 
identity reflects the degree to which two segments of DNA or protein are identical over a 
particular length. The ranges of percent identity between the non-Arabidopsis genes shown in 
Figure 3 and the Arabidopsis genes in the Sequence Listing are: SEQ ID No. 1: 53%-[...]; SEQ ID 
No. 3: 38%-76%; SEQ ID No. 5: 34%-67%; SEQ ID No. 7: 50%-[...]; [...]; 
SEQ ID No. 11: 48%-66%; [...]; SEQ ID No. 
17: 65%-94%; SEQ ID No. 19: 72%-83%; SEQ ID No. 21: 52%-64%; SEQ ID No. 23: 40%-89%; 
SEQ ID No. 25: 86%-97%; SEQ ID No. 27: 41%-75%; SEQ ID No. 29: 29%-72%; SEQ ID 
No. 31: 49%-70%; SEQ ID No. 33: 56%-86%; SEQ ID No. 35: 61%-84%; SEQ ID No. 37: 40%-58%; 
SEQ ID No. 39: 63%-87%; SEQ ID No. 41: 51%-88%; [...]; SEQ ID 
No. 45: 79%-90%; SEQ ID No. 47: 30%-58%; SEQ ID No. 49: 52%-62%; SEQ ID No. 51: 55%-73%; 
and SEQ ID No. 53: 44%-80%. 
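
A minimal sketch of a percentage identity calculation over an aligned segment follows. It assumes the two sequences are already aligned to equal length with "-" marking gaps, and it excludes gap columns from the denominator; other conventions exist, and the specification does not say which one was used. The example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared (non-gap) alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" or residue_b == "-":
            continue  # gap columns are skipped under this convention
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned protein segments: 6 of 8 positions match.
identity = percent_identity("MKTAYIAK", "MKSAYLAK")
print(f"{identity:.0f}% identity")
```

A user-selected identity criterion, such as a minimum percentage, could then be applied to the returned value to accept or reject a candidate homolog.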

The polynucleotides and polypeptides in the Sequence Listing and the identified 
homologous sequences may be stored in a computer system and have associated or linked with 
the sequences a function, such as that the polynucleotides and polypeptides are useful for 
modifying the environmental stress tolerance of a plant. 

All references, publications, patents and other documents herein are incorporated by 
reference in their entirety for all purposes. Although the invention has been described with 
reference to the embodiments and examples above, it should be understood that various 
modifications can be made without departing from the spirit of the invention. 

What is claimed is: 

1. [...] tolerance, which plant comprises a [...] 
[...] where N=1-27, or a complementary nucleotide sequence thereof; 
[...] 1, where N [...] substitutions in a nucleotide sequence of (c); 
[...] any of (a)-(e); 
[...] subsequence or fragment of any of (a)-(f), which subsequence or fragment encodes a [...] 
[...] identity to a polypeptide of SEQ ID Nos. 2N [...] 
[...] identity to a conserved domain of a polypeptide of SEQ ID Nos. 2N. 

2. [...] a promoter operably linked to said nucleotide sequence. 

3. [...] of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, [...] 
banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, 
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, 
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, and 
vegetable brassicas. 

4. An isolated or recombinant polynucleotide comprising a nucleotide sequence selected 
from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected from 
SEQ ID Nos. 2N, where N=1-27, or a complementary nucleotide sequence thereof; 
(b) a nucleotide sequence encoding a polypeptide comprising a conservatively substituted 
variant of a polypeptide of (a); 
(c) a nucleotide sequence comprising a sequence selected from those of SEQ ID Nos. 2N-1, 
where N=1-27, or a complementary nucleotide sequence thereof; 
(d) a nucleotide sequence comprising silent substitutions in a nucleotide sequence of (c); 
(e) a nucleotide sequence which hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a), (b), (c), or (d); 
(f) a nucleotide sequence comprising at least 15 consecutive nucleotides of a sequence of 
any of (a)-(e); 
(g) a nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which 
subsequence or fragment encodes a polypeptide that modifies a plant's environmental 
stress tolerance; 
(h) a nucleotide sequence having at least 30% sequence identity to a nucleotide sequence 
of any of (a)-(g); 
(i) a nucleotide sequence having at least 60% sequence identity to a nucleotide 
sequence of any of (a)-(g); 
(j) a nucleotide sequence which encodes a polypeptide having at least 30% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-27; 
(k) a nucleotide sequence which encodes a polypeptide having at least 60% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-27; and 
(l) a nucleotide sequence which encodes a conserved domain of a polypeptide having at 
least 65% sequence identity to a conserved domain of a polypeptide of SEQ ID Nos. 2N, 
where N=1-27. 

5. The isolated or recombinant polynucleotide of claim 4, further comprising a constitutive, 
inducible, or tissue-active promoter operably linked to the nucleotide sequence. 

6. A cloning or expression vector comprising the isolated or recombinant polynucleotide of 
claim 4. 

7. A cell comprising the cloning or expression vector of claim 6. 

8. A transgenic plant comprising the isolated or recombinant polynucleotide of claim 4. 

9. A composition produced by one or more of: 

(a) incubating one or more polynucleotide of claim 4 with a nuclease; 
(b) incubating one or more polynucleotide of claim 4 with a restriction enzyme; 
(c) incubating one or more polynucleotide of claim 4 with a polymerase; 
(d) incubating one or more polynucleotide of claim 4 with a polymerase and a primer; 
(e) incubating one or more polynucleotide of claim 4 with a cloning vector; or 
(f) incubating one or more polynucleotide of claim 4 with a cell. 

10. A composition comprising two or more different polynucleotides of claim 4. 

11. An isolated or recombinant polypeptide comprising a subsequence of at least about 15 
contiguous amino acids encoded by the recombinant or isolated polynucleotide of claim 4. 

12. A plant ectopically expressing an isolated polypeptide of claim 11. 

13. A method for producing a plant having a modified environmental stress tolerance, the 
[...] claim 4 or the expression levels or activity of a polypeptide of claim 11 in a plant, thereby 
producing a modified plant, and selecting the modified plant for improved environmental stress 
tolerance, thereby providing the modified plant with a modified environmental stress tolerance. 



14. The method of claim 13, wherein the polynucleotide is a polynucleotide of claim 4. 

15. A method of identifying a factor that is modulated by or interacts with a polypeptide 
encoded by a polynucleotide of claim 4, the method comprising: 

(a) expressing a polypeptide encoded by the polynucleotide in a plant; and 
(b) identifying at least one factor that is modulated by or interacts with the polypeptide. 

16. The method of claim 15, wherein the identifying is performed by detecting binding by the 
polypeptide to a promoter sequence, or detecting interactions between an additional protein and 
the polypeptide in a yeast two hybrid system. 

17. The method of claim 15, wherein the identifying is performed by detecting expression of 
a factor by hybridization to a microarray, subtractive hybridization or differential display. 

18. A method of identifying a molecule that modulates activity or expression of a 
polynucleotide or polypeptide of interest, the method comprising: 

(a) placing the molecule in contact with a plant comprising the polynucleotide or 
polypeptide encoded by the polynucleotide of claim 4; and, 
(b) monitoring one or more of: 
(i) expression level of the polynucleotide in the plant; 
(ii) expression level of the polypeptide in the plant; 
(iii) modulation of an activity of the polypeptide in the plant; or 
(iv) modulation of an activity of the polynucleotide in the plant. 

19. An integrated system, computer or computer readable medium comprising one or more 
character strings corresponding to a polynucleotide of claim 4, or to a polypeptide encoded by the 
polynucleotide. 

20. The integrated system, computer or computer readable medium of claim 19, further 
comprising a link between said one or more sequence strings to a modified plant environmental 
stress tolerance phenotype. 

21. A method of identifying a sequence similar or homologous to one or more 
polynucleotides of claim 4, or one or more polypeptides encoded by the polynucleotides, the 
method comprising: 

(a) providing a sequence database; and, 
(b) querying the sequence database with one or more target sequences corresponding to 
the one or more polynucleotides or to the one or more polypeptides to identify one or 
more sequence members of the database that display sequence similarity or homology to 
one or more of the one or more target sequences. 

22. The method of claim 21, wherein the querying comprises aligning one or more of the 
target sequences with one or more of the one or more sequence members in the sequence 
database. 

23. The method of claim 21, wherein the querying comprises identifying one or more of the 
one or more sequence members of the database that meet a user-selected identity criteria with one 
or more of the target sequences. 

24. The method of claim 21, further comprising linking the one or more of the 
polynucleotides of claim 4, or encoded polypeptides, to a modified plant environmental stress 
tolerance phenotype. 

25. A plant comprising altered expression levels of an isolated or recombinant polynucleotide 
of claim 4. 

26. A plant comprising altered expression levels or the activity of an isolated or recombinant 
polypeptide of claim 11. 

27. A plant lacking a nucleotide sequence encoding a polypeptide of claim 11. 

Figure 1 

SEQ ID No.   GID    cDNA or protein   conserved domain 

1            G22    cDNA 
2            G22    protein           89-157 
3            G188   cDNA 
4            G188   protein           175-222 
5            G225   cDNA 
6            G225   protein           39-76 
7            G226   cDNA 
8            G226   protein           28-78 
9            G256   cDNA 
10           G256   protein           13-115 
11           G419   cDNA 
12           G419   protein           392-452 
13           G464   cDNA 
14           G464   protein           7-15, 70-80, 125-158, 183-219 
15           G482   cDNA 
16           G482   protein           25-116 
17           G502   cDNA 
18           G502   protein           10-155 
19           G526   cDNA 
20           G526   protein           21-149 
21           G545   cDNA 
22           G545   protein           82-102, 136-154 
23           G561   cDNA 
24           G561   protein           248-308 
25           G664   cDNA 
26           G664   protein           13-116 
27           G682   cDNA 
28           G682   protein           22-53 
29           G911   cDNA 
30           G911   protein           86-129 
31           G964   cDNA 
32           G964   protein           126-186 
33           G394   cDNA 
34           G394   protein           121-182 
35           G489   cDNA 
36           G489   protein           57-156 

Figure 2 



SEQ ID NolGID 



37 



38 



39 



40 



G463 



G463 



G767 



G767 



homolog of G464 



homolog of G464 



homolog of G502 



homolog of G502 



homolog of G526 



IcPNA or protein 



conserved domain 



cDNA 



protein 



cDNA 



protein 



cDNA 



14-23. 77-88,130-146.194-227 



8-158 



42 



43 



44 



45 



46 



47 



48 



49 



G765 
G765 



homolog of G526 



protein 



G197 



homolog of G664 



cDNA 



G197 



homolog of G664 



protein 



G255 
G255 



homolog of G664 



cDNA 



homolog of G664 



protein 



G1113 



homolog of G911 



cDNA 



G1113 



homolog of G911 



G398 



homolog of G964 



proteir 



cDNA 



G398 



homolog of G964 



homolog of G394 



prote 



in 



cDNA 



14-119 



14-115 



85-128 



128-191 



72-135 

Figure 3A 

SEQ ID No.   GID    Genbank NID   P-value    Species 

1            G22    790359        1.00E-45   Nicotiana tabacum 
1            G22    3342210       6.60E-45   Lycopersicon esculentum 
1            G22    6654776       1.60E-44   Medicago truncatula 
1            G22    8809570       5.80E-44   Nicotiana sylvestris 
1            G22    7627061       2.40E-39   Gossypium arboreum 
1            G22    7324479       9.50E-36   Lycopersicon pennellii 
1            G22    8980312       4.30E-31   Catharanthus roseus 
1            G22    7528275       1.20E-30   Mesembryanthemum crystallinum 
1            G22    6478844       4.60E-28   Matricaria chamomilla 
1            G22    6847348       5.90E-26   Glycine max 
3            G188   7779802       5.20E-36   Lotus japonicus 
3            G188   7284340       2.10E-34   Glycine max 
3            G188   9361307       1.20E-27   Triticum aestivum 
3            G188   7340336       1.10E-22   Oryza sativa 
3            G188   6529152       3.60E-22   Lycopersicon esculentum 
3            G188   8748477       7.70E-21   Medicago truncatula 
3            G188   5456433       7.10E-14   Zea mays 
3            G188   9302479       1.60E-12   Sorghum bicolor 
3            G188   6696287       4.10E-12   Pinus taeda 
3            G188   562242        9.00E-12   Brassica rapa 
5            G225   4396287       4.40E-16   Glycine max 
5            G225   309571        0.00029    Zea mays 
5            G225   3857004       0.001      Populus tremula x Populus tremuloides 
5            G225   9410205       0.019      Triticum aestivum 
5            G225   9426190       0.025      Triticum turgidum subsp. durum 
5            G225   8382118       0.046      Gossypium arboreum 
5            G225   6782756       0.27       Oryza sativa 
5            G225   7721017       0.4        Lotus japonicus 
5            G225   6020136       0.47       Pinus taeda 
5            G225   2921331       0.48       Gossypium hirsutum 
7            G226   4396287       5.10E-15   Glycine max 
7            G226   9410205       1.50E-05   Triticum aestivum 
7            G226   3857004       0.11       Populus tremula x Populus tremuloides 
7            G226   2428139       0.35       Oryza sativa 
9            G256   1430847       1.30E-72   Lycopersicon esculentum 
9            G256   9252441       1.20E-65   Solanum tuberosum 
9            G256   8380712       2.20E-58   Gossypium arboreum 
9            G256   8172976       1.60E-54   Medicago truncatula 
9            G256   9205295       1.30E-44   Glycine max 
9            G256   20562         6.40E-40   Petunia x hybrida 
9            G256   4886263       4.40E-37   Antirrhinum majus 
9            G256   6552360       5.00E-36   Nicotiana tabacum 
9            G256   2312003       1.20E-35   Oryza sativa 
9            G256   5268628       5.20E-35   Zea mays 
11           G419   7239156       2.60E-59   Malus x domestica 
11           G419   5278451       9.00E-58   Lycopersicon esculentum 
11           G419   9205496       1.30E-55   Glycine max 
11           G419   7628137       9.30E-51   Gossypium arboreum 
11           G419   6069643       9.50E-51   Oryza sativa 
11           G419   7562931       9.80E-45   Medicago truncatula 
11           G419   7322293       2.30E-37   Lycopersicon hirsutum 
11           G419   8404716       1.10E-29   Hordeum vulgare 
11           G419   7217755       1.40E-29   Sorghum bicolor 

Figure 3B 

Figure 3C 

SEQ ID No.   GID    Genbank NID   P-value    Species 

23           G561   1033058       5.90E-65   Raphanus sativus 
23           G561   2815304       2.10E-35   Spinacia oleracea 
23           G561   1498300       1.60E-34   Petroselinum crispum 
23           G561   169958        8.10E-32   Glycine max 
23           G561   5381310       2.20E-30   Catharanthus roseus 
23           G561   1155053       9.70E-28   Phaseolus vulgaris 
23           G561   728627        1.90E-27   Nicotiana tabacum 
23           G561   7565950       1.40E-21   Medicago truncatula 
25           G664   1167483       4.90E-81   Lycopersicon esculentum 
25           G664   7765706       6.30E-69   Medicago truncatula 
25           G664   19052         9.30E-68   Hordeum vulgare 
25           G664   7626566       4.00E-67   Gossypium arboreum 
25           G664   5050757       2.60E-66   Gossypium hirsutum 
25           G664   6850206       6.90E-66   Oryza sativa 
25           G664   6667606       2.20E-63   Glycine max 
25           G664   517492        9.30E-62   Zea mays 
25           G664   9302672       1.50E-59   Sorghum bicolor 
25           G664   5860031       9.20E-58   Pinus taeda 
27           G682   309571        4.40E-08   Zea mays 
27           G682   4396287       1.10E-05   Glycine max 
27           G682   3857004       0.00051    Populus tremula x Populus tremuloides 
27           G682   9410205       0.00085    Triticum aestivum 
27           G682   8382118       0.0079     Gossypium arboreum 
27           G682   2428139       0.017      Oryza sativa 
27           G682   7339148       0.13       Lycopersicon esculentum 
27           G682   9302672       0.32       Sorghum bicolor 
27           G682   5048991       0.39       Gossypium hirsutum 
27           G682   6555777       0.46       Pinus taeda 
29           G911   4090113       6.10E-51   Brassica napus 
29           G911   5893315       7.70E-25   Lycopersicon esculentum 
29           G911   5048452       3.10E-23   Gossypium hirsutum 
29           G911   9440241       1.90E-21   Glycine max 
29           G911   6917169       1.80E-11   Lycopersicon pennellii 
29           G911   9297970       3.20E-11   Sorghum bicolor 
29           G911   7137594       4.90E-11   Zea mays 
29           G911   9278447       4.60E-10   Lotus japonicus 
29           G911   7560271       7.20E-10   Medicago truncatula 
29           G911   5043346       4.50E-09   Sorghum halepense 
31           G964   7624806       3.30E-72   Gossypium arboreum 
31           G964   1234899       9.10E-66   Glycine max 
31           G964   1149534       1.50E-61   Pimpinella brachycarpa 
31           G964   8919872       3.40E-51   Capsella rubella 
31           G964   [...]         [...]E-51  Lycopersicon esculentum 
31           G964   1235564       1.50E-38   Oryza sativa 
31           G964   6605613       3.00E-32   Medicago truncatula 
31           G964   1032371       4.50E-28   Helianthus annuus 
31           G964   3868846       2.80E-25   Ceratopteris richardii 
31           G964   8088109       6.40E-22   Sorghum bicolor 
33           G394   8670502       7.90E-59   Glycine max 
33           G394   3171738       2.00E-54   Craterostigma plantagineum 
33           G394   1032371       1.10E-50   Helianthus annuus 
33           G394   7624806       4.30E-47   Gossypium arboreum 
33           G394   1160483       2.10E-46   Pimpinella brachycarpa 



SEQ ID No.  GID    Genbank NID  P-value   Species
33          G394   3868846      4.20E-45  Ceratopteris richardii
33          G394   992597       1.10E-44  Lycopersicon esculentum
33          G394   7558511      1.50E-44  Medicago truncatula
33          G394   8099247      6.20E-43  Oryza sativa
33          G394   8919872      1.20E-40  Capsella rubella
35          G489   6534956      4.40E-62  Lycopersicon esculentum
35          G489   9055852      2.60E-60  Medicago truncatula
35          G489   8382393      6.20E-51  Gossypium arboreum
35          G489   8789169      2.10E-50  Citrus x paradisi
35          G489   9252957      1.50E-47  Solanum tuberosum
35          G489   6918056      4.70E-47  Lycopersicon pennellii
35          G489   7590809      1.00E-46  Glycine max
35          G489   5257255      8.60E-43  Oryza sativa
35          G489   4152190      3.20E-41  Zea mays
35          G489   6069260      2.10E-39  Ceratodon purpureus
37          G463   6527230      4.90E-36  Lycopersicon esculentum
37          G463   9305572      5.50E-36  Sorghum bicolor
37          G463   3760881      1.20E-31  Oryza sativa
37          G463   6604917      1.30E-23  Medicago truncatula
37          G463   5058123      2.50E-21  Glycine max
37          G463   5044476      1.10E-19  Gossypium hirsutum
37          G463   9412603      1.70E-17  Triticum aestivum
37          G463   9419394      6.00E-17  Hordeum vulgare
37          G463   7624108      6.20E-17  Gossypium arboreum
37          G463   8547152      3.20E-16  Nicotiana tabacum
39          G767   5510359      2.80E-76  Glycine max
39          G767   7643155      4.20E-74  Medicago truncatula
39          G767   6977319      1.10E-72  Lycopersicon esculentum
39          G767   6730939      4.20E-68  Oryza sativa
39          G767   7502501      2.00E-67  Gossypium arboreum
39          G767   9302206      3.10E-65  Sorghum bicolor
39          G767   4218534      4.30E-51  Triticum sp.
39          G767   6732157      4.30E-51  Triticum monococcum
39          G767   9412602      6.90E-47  Triticum aestivum
39          G767   8329134      1.30E-46  Mesembryanthemum crystallinum
41          G765   4384535      3.10E-56  Lycopersicon esculentum
41          G765   6454868      8.50E-56  Glycine max
41          G765   1279639      4.30E-53  Petunia x hybrida
41          G765   4977542      2.00E-51  Oryza sativa
41          G765   4218536      2.00E-50  Triticum sp.
41          G765   6732159      2.00E-50  Triticum monococcum
41          G765   5049217      6.90E-50  Gossypium hirsutum
41          G765   9361647      4.50E-49  Triticum aestivum
41          G765   9296257      2.90E-48  Sorghum bicolor
41          G765   8708684      4.30E-46  Hordeum vulgare
43          G197   1167483      2.70E-76  Lycopersicon esculentum
43          G197   7626566      2.40E-73  Gossypium arboreum
43          G197   7765706      ?.50E-63  Medicago truncatula
43          G197   19052        8.90E-63  Hordeum vulgare



G197 



5050757 



G197 



6850206 



1.60E-62 
1.10E-61 



Gossypium hirsutum



G197 
G197 



6667606 
517492 



1.70E-61 
7.60E-59 



Oryza sativa 
Glycine max 
Figure 3E



SEQ ID No.  GID    Genbank NID  P-value   Species
43          G197   5860031      3.90E-57  Pinus taeda
43          G197   9302672      3.80E-55  Sorghum bicolor
45          G255   1167483      6.40E-75  Lycopersicon esculentum
45          G255   7626566      6.40E-71  Gossypium arboreum
45          G255   19050        2.80E-65  Hordeum vulgare
45          G255   5050757      3.70E-63  Gossypium hirsutum
45          G255   7590249      4.10E-62  Glycine max
45          G255   7765706      4.40E-62  Medicago truncatula
45          G255   6850206      1.10E-61  Oryza sativa
45          G255   517492       3.50E-59  Zea mays
45          G255   9302672      1.60E-56  Sorghum bicolor
45          G255   7721017      2.60E-55  Lotus japonicus
47          G1113  4090113      2.30E-36  Brassica napus
47          G1113  5048452      6.80E-12  Gossypium hirsutum
47          G1113  5893315      9.50E-11  Lycopersicon esculentum
47          G1113  9440241      7.70E-09  Glycine max
49          G398   7624806      2.80E-67  Gossypium arboreum
49          G398   1234899      6.90E-64  Glycine max
49          G398   1149534      6.20E-63  Pimpinella brachycarpa
49          G398   8919872      2.60E-47  Capsella rubella
49          G398   992597       1.10E-39  Lycopersicon esculentum
49          G398   1235564      7.70E-39  Oryza sativa
49          G398   6605613      1.70E-33  Medicago truncatula
49          G398   8088109      3.60E-33  Sorghum bicolor
49          G398   3868846      1.60E-32  Ceratopteris richardii
49          G398   3171738      1.00E-27  Craterostigma plantagineum
51          G395   992597       5.30E-51  Lycopersicon esculentum
51          G395   7624806      2.00E-50  Gossypium arboreum
51          G395   1234899      1.50E-49  Glycine max
51          G395   1165131      1.90E-48  Pimpinella brachycarpa
51          G395   3868846      3.40E-47  Ceratopteris richardii
51          G395   7415619      1.30E-41  Physcomitrella patens
51          G395   8919872      7.40E-41  Capsella rubella
51          G395   1235564      2.70E-38  Oryza sativa
51          G395   8088109      2.30E-33  Sorghum bicolor
51          G395   1032371      3.30E-31  Helianthus annuus
53          G393   8670502      3.60E-55  Glycine max
53          G393   9199975      7.60E-46  Medicago truncatula
53          G393   3868846      9.60E-37  Ceratopteris richardii
53          G393   8919872      2.50E-35  Capsella rubella
53          G393   7624806      1.30E-34  Gossypium arboreum
53          G393   7415619      1.00E-33  Physcomitrella patens
53          G393   5897000      5.50E-33  Lycopersicon esculentum
53          G393   1235564      4.00E-32  Oryza sativa
53          G393   1165131      6.40E-32  Pimpinella brachycarpa
53          G393   3171738      1.50E-31  Craterostigma plantagineum



WO 01/36598 



PCT/US00/31458



MBI16 Sequence Listing. ST25 
SEQUENCE LISTING 

<110> Pineda, Omaira 
Yu, Guo-Liang 
Creelman, Robert 
Riechmann, Jose Luis 
Heard, Jacqueline 
Ratcliffe, Oliver 
Reuber, Lynne 
Keddie, James 

<120> Environmental Stress Tolerance Genes 
<130> MBI-0016 

<150> 60/166,228 
<151> 1999-11-17 

<150> 60/197,899 

<151> 2000-04-17 

<150> Plant Trait Modification III 

<151> 2000-08-22 

<160> 54 

<170> PatentIn version 3.0

<210> 1 

<211> 913 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (81)..(761)

<223> G22 

agaaaacatc tctcactctc taaaatacac actctcatca aaaaccttct cttcggttca 60 

gaagcattca agaatccatt atg agc tca tct gat tcc gtt aat aac ggc gtt 113
Met Ser Ser Ser Asp Ser Val Asn Asn Gly Val
1 5 10

[nucleotides 114 to 449 illegible in source]

acg ccg gag gac gcg gcg gtg gcg tac gac cga gcg gcg ttt cag ctc 497
Thr Pro Glu Asp Ala Ala Val Ala Tyr Asp Arg Ala Ala Phe Gln Leu
125 130 135

aga gga tcg aaa gct aag ctg aat ttt ccg cat ttg att ggt tct tgt 545
Arg Gly Ser Lys Ala Lys Leu Asn Phe Pro His Leu Ile Gly Ser Cys
140 145 150 155

aag tat gag ccg gtt agg att agg cct cgc cgt cgc tcg ccg gaa ccg 593
Lys Tyr Glu Pro Val Arg Ile Arg Pro Arg Arg Arg Ser Pro Glu Pro
160 165 170

tca gtc tcc gat cag tta acg tcg gag cag aag agg gaa agc cac gtg 641
Ser Val Ser Asp Gln Leu Thr Ser Glu Gln Lys Arg Glu Ser His Val
175 180 185

gat gac ggc gag tct agt ttg gtt gta ccg gag ttg gat ttc acg gtg 689
Asp Asp Gly Glu Ser Ser Leu Val Val Pro Glu Leu Asp Phe Thr Val
190 195 200

gat cag ttt tac ttc gat ggt agt tta tta atg gac caa tca gaa tgt 737
Asp Gln Phe Tyr Phe Asp Gly Ser Leu Leu Met Asp Gln Ser Glu Cys
205 210 215

tct tat tct gat aat cgg ata taa ttagttttaa gattaagcaa aatttgtcca 791
Ser Tyr Ser Asp Asn Arg Ile
220 225

acgagttttg ctgtatgaaa tatctatcga tgactcaaca ggttttgatc atgatcatat 851

gtaatgtgat ggaaattaaa tattgacgtt tgtttttttg ttgtaaaaaa aaaaaaaaaa 911

aa 913 

<210> 2 
<211> 226 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 2 

Met Ser Ser Ser Asp Ser Val Asn Asn Gly Val Asn Ser Arg Met Tyr
1 5 10 15

Phe Arg Asn Pro Ser Phe Ser Asn Val Ile Leu Asn Asp Asn Trp Ser
20 25 30

Asp Leu Pro Leu Ser Val Asp Asp Ser Gln Asp Met Ala Ile Tyr Asn
35 40 45

Thr Leu Arg Asp Ala Val Ser Ser Gly Trp Thr Pro Ser Val Pro Pro
50 55 60

Val Thr Ser Pro Ala Glu Glu Asn Lys Pro Pro Ala Thr Lys Ala Ser
65 70 75 80

Gly Ser His Ala Pro Arg Gln Lys Gly Met Gln Tyr Arg Gly Val Arg
85 90 95

Arg Arg Pro Trp Gly Lys Phe Ala Ala Glu Ile Arg Asp Pro Lys Lys
100 105 110

Asn Gly Ala Arg Val Trp Leu Gly Thr Tyr Glu Thr Pro Glu Asp Ala
115 120 125

Ala Val Ala Tyr Asp Arg Ala Ala Phe Gln Leu Arg Gly Ser Lys Ala
130 135 140

Lys Leu Asn Phe Pro His Leu Ile Gly Ser Cys Lys Tyr Glu Pro Val
145 150 155 160

Arg Ile Arg Pro Arg Arg Arg Ser Pro Glu Pro Ser Val Ser Asp Gln
165 170 175

Leu Thr Ser Glu Gln Lys Arg Glu Ser His Val Asp Asp Gly Glu Ser
180 185 190

Ser Leu Val Val Pro Glu Leu Asp Phe Thr Val Asp Gln Phe Tyr Phe
195 200 205

Asp Gly Ser Leu Leu Met Asp Gln Ser Glu Cys Ser Tyr Ser Asp Asn
210 215 220

Arg Ile
225 

<210> 3 
<211> 1195 
<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS

<222> (50)..(1096)

<223> G188 

ctctcaccaa cataatcaaa gaagctttcc tcacgaattc aagatcgcc atg tcc tcc 58
Met Ser Ser
1

gag gat tgg gat ctc ttc gcc gtc gtc aga agc tgc agc tct tct gtt 106
Glu Asp Trp Asp Leu Phe Ala Val Val Arg Ser Cys Ser Ser Ser Val
5 10 15

tcc acc acc aat tct tgt gct ggt cat gaa gac gac ata gga aac tgt 154
Ser Thr Thr Asn Ser Cys Ala Gly His Glu Asp Asp Ile Gly Asn Cys
20 25 30 35

aaa caa caa caa gat cct cct cct cct cct ctg ttt caa gct tct tct 202
Lys Gln Gln Gln Asp Pro Pro Pro Pro Pro Leu Phe Gln Ala Ser Ser
40 45 50

tct tgc aac gag tta caa gat tct tgc aaa cca ttt tta ccc gtt act 250
Ser Cys Asn Glu Leu Gln Asp Ser Cys Lys Pro Phe Leu Pro Val Thr
55 60 65

act act act act act act tgg tct cct cct cct cta ctt cct cct cct 298
Thr Thr Thr Thr Thr Thr Trp Ser Pro Pro Pro Leu Leu Pro Pro Pro
70 75 80

aaa gcc tca tca cca tct ccc aat atc tta cta aaa caa gaa caa gta 346
Lys Ala Ser Ser Pro Ser Pro Asn Ile Leu Leu Lys Gln Glu Gln Val
85 90 95

ctt ctc gaa tca caa gat caa aaa cct cct ctt agt gtt agg gtt ttc 394
Leu Leu Glu Ser Gln Asp Gln Lys Pro Pro Leu Ser Val Arg Val Phe
100 105 110 115

cca cca tcc act tct tct tct gtc ttt gtt ttt aga ggt caa cgc gac 442
Pro Pro Ser Thr Ser Ser Ser Val Phe Val Phe Arg Gly Gln Arg Asp
120 125 130

cag ctt ctt caa caa caa tcc caa cct ccc ctt cga tct aga aaa aga 490
Gln Leu Leu Gln Gln Gln Ser Gln Pro Pro Leu Arg Ser Arg Lys Arg
135 140 145

aag aat cag caa aaa aga acc ata tgt cat gta acg caa gag aat ctt 538
Lys Asn Gln Gln Lys Arg Thr Ile Cys His Val Thr Gln Glu Asn Leu
150 155 160

tct tct gat ttg tgg gct tgg cgt aaa tac ggt caa aaa ccc atc aaa 586
Ser Ser Asp Leu Trp Ala Trp Arg Lys Tyr Gly Gln Lys Pro Ile Lys
165 170 175

ggc tct cct tat cca agg aat tat tac aga tgt agt agc tca aaa gga 634
Gly Ser Pro Tyr Pro Arg Asn Tyr Tyr Arg Cys Ser Ser Ser Lys Gly
180 185 190 195

tgt tta gca cga aaa caa gtt gaa aga agt aat tta gat cct aat atc 682
Cys Leu Ala Arg Lys Gln Val Glu Arg Ser Asn Leu Asp Pro Asn Ile
200 205 210

ttc atc gtt act tac acc gga gaa cac act cat cca cgt cct act cac 730
Phe Ile Val Thr Tyr Thr Gly Glu His Thr His Pro Arg Pro Thr His
215 220 225

cgg aac tct ctc gcc gga agt act cgt aac aaa tct cag ccc gtt aac 778
Arg Asn Ser Leu Ala Gly Ser Thr Arg Asn Lys Ser Gln Pro Val Asn
230 235 240

ccg gtt cct aaa ccg gac aca tct cct tta tcg gat aca gta aaa gaa 826
Pro Val Pro Lys Pro Asp Thr Ser Pro Leu Ser Asp Thr Val Lys Glu
245 250 255

gag att cat ctt tct ccg acg aca ccg ttg aaa gga aac gat gac gtt 874
Glu Ile His Leu Ser Pro Thr Thr Pro Leu Lys Gly Asn Asp Asp Val
260 265 270 275

caa gaa acg aat gga gat gaa gat atg gtt ggt caa gaa gtc aac atg 922
Gln Glu Thr Asn Gly Asp Glu Asp Met Val Gly Gln Glu Val Asn Met
280 285 290

gaa gag gaa gag gag gaa gaa gaa gtg gaa gaa gat gat gaa gaa gaa 970
Glu Glu Glu Glu Glu Glu Glu Glu Val Glu Glu Asp Asp Glu Glu Glu
295 300 305

gaa gat gat gat gac gtg gat gat ctt ttg ata cca aat tta gcg gtg 1018
Glu Asp Asp Asp Asp Val Asp Asp Leu Leu Ile Pro Asn Leu Ala Val
310 315 320

aga gat cga gat gat ttg ttc ttc gct gga agt ttt cca tct tgg tcc 1066
Arg Asp Arg Asp Asp Leu Phe Phe Ala Gly Ser Phe Pro Ser Trp Ser
325 330 335

gcc gga tcc gcc ggt gac ggt ggt gga tga tgaaaacgaa taaaatctca 1116
Ala Gly Ser Ala Gly Asp Gly Gly Gly
340 345

atttacaatt tacaaaaaga aaaaagtcag tttttaatta ttatttttgt ttgttaaaac 1176
ttgacattta ttgtgttat 1195

<210> 4 
<211> 348 
<212> PRT 

<213> Arabidopsis thaliana
<400> 4 

Met Ser Ser Glu Asp Trp Asp Leu Phe Ala Val Val Arg Ser Cys Ser
1 5 10 15

Ser Ser Val Ser Thr Thr Asn Ser Cys Ala Gly His Glu Asp Asp Ile
20 25 30

Gly Asn Cys Lys Gln Gln Gln Asp Pro Pro Pro Pro Pro Leu Phe Gln
35 40 45

Ala Ser Ser Ser Cys Asn Glu Leu Gln Asp Ser Cys Lys Pro Phe Leu
50 55 60

Pro Val Thr Thr Thr Thr Thr Thr Thr Trp Ser Pro Pro Pro Leu Leu
65 70 75 80

Pro Pro Pro Lys Ala Ser Ser Pro Ser Pro Asn Ile Leu Leu Lys Gln
85 90 95

Glu Gln Val Leu Leu Glu Ser Gln Asp Gln Lys Pro Pro Leu Ser Val
100 105 110

Arg Val Phe Pro Pro Ser Thr Ser Ser Ser Val Phe Val Phe Arg Gly
115 120 125

Gln Arg Asp Gln Leu Leu Gln Gln Gln Ser Gln Pro Pro Leu Arg Ser
130 135 140

Arg Lys Arg Lys Asn Gln Gln Lys Arg Thr Ile Cys His Val Thr Gln
145 150 155 160

Glu Asn Leu Ser Ser Asp Leu Trp Ala Trp Arg Lys Tyr Gly Gln Lys
165 170 175

Pro Ile Lys Gly Ser Pro Tyr Pro Arg Asn Tyr Tyr Arg Cys Ser Ser
180 185 190

Ser Lys Gly Cys Leu Ala Arg Lys Gln Val Glu Arg Ser Asn Leu Asp
195 200 205

Pro Asn Ile Phe Ile Val Thr Tyr Thr Gly Glu His Thr His Pro Arg
210 215 220

Pro Thr His Arg Asn Ser Leu Ala Gly Ser Thr Arg Asn Lys Ser Gln
225 230 235 240

Pro Val Asn Pro Val Pro Lys Pro Asp Thr Ser Pro Leu Ser Asp Thr
245 250 255

Val Lys Glu Glu Ile His Leu Ser Pro Thr Thr Pro Leu Lys Gly Asn
260 265 270

Asp Asp Val Gln Glu Thr Asn Gly Asp Glu Asp Met Val Gly Gln Glu
275 280 285

Val Asn Met Glu Glu Glu Glu Glu Glu Glu Glu Val Glu Glu Asp Asp
290 295 300

Glu Glu Glu Glu Asp Asp Asp Asp Val Asp Asp Leu Leu Ile Pro Asn
305 310 315 320

Leu Ala Val Arg Asp Arg Asp Asp Leu Phe Phe Ala Gly Ser Phe Pro
325 330 335

Ser Trp Ser Ala Gly Ser Ala Gly Asp Gly Gly Gly
340 345



<210> 5
<211> 584
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (157)..(441)
<223> G225
<400> 5



ctctctctct cactcttttc ttttccgaga acccaacaaa aaaaaagcta ctattaatcc 60

ttcccctcgt gaggaaatca tttcttcttg tttctcgaga tttattctct ttctctctct 120

ctttctctgt gtgtttcgtg tcttcagatt agttcg atg ttt cgt tca gac aag 174
Met Phe Arg Ser Asp Lys
1 5

gcg gaa aaa atg gat aaa cga cga cgg aga cag agc aaa gcc aag gct 222
Ala Glu Lys Met Asp Lys Arg Arg Arg Arg Gln Ser Lys Ala Lys Ala
10 15 20

tct tgt tcc gaa gag gtg agt agt atc gaa tgg gaa gct gtg aag atg 270
Ser Cys Ser Glu Glu Val Ser Ser Ile Glu Trp Glu Ala Val Lys Met
25 30 35

tca gaa gaa gaa gaa gat ctc att tct cgg atg tat aaa ctc gtt ggc 318
Ser Glu Glu Glu Glu Asp Leu Ile Ser Arg Met Tyr Lys Leu Val Gly
40 45 50

gac agg tgg gag ttg atc gcc gga agg atc ccg gga cgg acg ccg gag 366
Asp Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Pro Glu
55 60 65 70

gag ata gag aga tat tgg ctt atg aaa cac ggc gtc gtt ttt gcc aac 414
Glu Ile Glu Arg Tyr Trp Leu Met Lys His Gly Val Val Phe Ala Asn
75 80 85

aga cga aga gac ttt ttt agg aaa tga ttttttttgt ttggattaaa 461
Arg Arg Arg Asp Phe Phe Arg Lys
90

agaaaatttt cctctcctta attcacaaga caagaaaaaa aggaaatgta cctgtccttg 521
aattactatt ttggaatgta taattatcta tatatataag aagaaaaaat tgcttaggaa 581
ttt 584

<210> 6
<211> 94
<212> PRT

<213> Arabidopsis thaliana
<400> 6

Met Phe Arg Ser Asp Lys Ala Glu Lys Met Asp Lys Arg Arg Arg Arg
1 5 10 15

Gln Ser Lys Ala Lys Ala Ser Cys Ser Glu Glu Val Ser Ser Ile Glu
20 25 30

Trp Glu Ala Val Lys Met Ser Glu Glu Glu Glu Asp Leu Ile Ser Arg
35 40 45

Met Tyr Lys Leu Val Gly Asp Arg Trp Glu Leu Ile Ala Gly Arg Ile
50 55 60

Pro Gly Arg Thr Pro Glu Glu Ile Glu Arg Tyr Trp Leu Met Lys His
65 70 75 80

Gly Val Val Phe Ala Asn Arg Arg Arg Asp Phe Phe Arg Lys
85 90



<210> 7 

<211> 407 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (10)..(348)

<223> G226 

<400> 7

[nucleotides 1 to 51 illegible in source]

ctt agg caa act aag ttc act cga tcc cga tat gac tct gaa gaa gtg 99
Leu Arg Gln Thr Lys Phe Thr Arg Ser Arg Tyr Asp Ser Glu Glu Val
15 20 25 30

[nucleotides 100 to 195 illegible in source]

gca gga aga gtc gta gga aga aag gca aat gag att gag aga tac tgg 243
Ala Gly Arg Val Val Gly Arg Lys Ala Asn Glu Ile Glu Arg Tyr Trp
65 70 75

[nucleotides 244 to 339 illegible in source]

aaa ttg taa agaaatcaaa ataaaagctt tcaatcataa aagtagaaca 388
Lys Leu

aatcttgaat gtcttctca 407



<210> 8 
<211> 112 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 8 

Met Asp Asn Thr Asn Arg Leu Arg Leu Arg Arg Gly Pro Ser Leu Arg
1 5 10 15

Gln Thr Lys Phe Thr Arg Ser Arg Tyr Asp Ser Glu Glu Val Ser Ser
20 25 30

Ile Glu Trp Glu Phe Ile Ser Met Thr Glu Gln Glu Glu Asp Leu Ile
35 40 45

Ser Arg Met Tyr Arg Leu Val Gly Asn Arg Trp Asp Leu Ile Ala Gly
50 55 60

Arg Val Val Gly Arg Lys Ala Asn Glu Ile Glu Arg Tyr Trp Ile Met
65 70 75 80

Arg Asn Ser Asp Tyr Phe Ser His Lys Arg Arg Arg Leu Asn Asn Ser
85 90 95

Pro Phe Phe Ser Thr Ser Pro Leu Asn Leu Gln Glu Asn Leu Lys Leu
100 105 110

<210> 9 
<211> 1547 
<212> DNA 
<213> Arabidopsis 

<220> 

<221> CDS 

<222> (312)..(1310)

<223> G256 

<400> 9 

tcgtgagcgt tgtgtttctc ctcaacattc aaagtcttta gtgaaacctc tcttgtaaga 60

agccaaaaaa ataaagagaa agattcaaag aaggaaagaa attgaggatg actatttcaa 120

gtccaaagag agattttgag tagaccctct tcacaaaaat ccaatcttag agtcttacta 180

gttactatct agcttacata cacagagaca ctataccaaa aatccaatct tattagagta 240
cttactatat agcttacaca tacacacaca cgaagtacta tttcaacgat caagagcgtg 300

tgcgtgagga t atg ggt aga cca cct tgt tgc gag aag att gag gtg aag 350
Met Gly Arg Pro Pro Cys Cys Glu Lys Ile Glu Val Lys
1 5 10

aaa gga cca tgg act ccc gaa gaa gac ata atc ttg gtc tct tat atc 398
Lys Gly Pro Trp Thr Pro Glu Glu Asp Ile Ile Leu Val Ser Tyr Ile
15 20 25

caa caa cac ggc cct gga aat tgg aga tct gtc cct gca aac acc ggt 446
Gln Gln His Gly Pro Gly Asn Trp Arg Ser Val Pro Ala Asn Thr Gly
30 35 40 45

ttg cta agg tgt agc aag agt tgc aga ctt aga tgg act aat tac ctt 494
Leu Leu Arg Cys Ser Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu
50 55 60

cgt ccc ggg atc aaa cga gga aat ttc act caa ccg gaa gag aag atg 542
Arg Pro Gly Ile Lys Arg Gly Asn Phe Thr Gln Pro Glu Glu Lys Met
65 70 75

atc atc cac ctt caa gct ctt ttg gga aat aga tgg gca gct ata gca 590
Ile Ile His Leu Gln Ala Leu Leu Gly Asn Arg Trp Ala Ala Ile Ala
80 85 90

tca tat cta cct cag agg acc gac aat gat atc aag aac tac tgg aac 638
Ser Tyr Leu Pro Gln Arg Thr Asp Asn Asp Ile Lys Asn Tyr Trp Asn
95 100 105

act cat ctt aaa aag aaa cta gtg atg atg aag ttt caa aat ggt atc 686
Thr His Leu Lys Lys Lys Leu Val Met Met Lys Phe Gln Asn Gly Ile
110 115 120 125

atc aac gaa aac aaa acc aat ctg gca aca gat att tcg tct tgt aat 734
Ile Asn Glu Asn Lys Thr Asn Leu Ala Thr Asp Ile Ser Ser Cys Asn
130 135 140

aat aac aac aat gga tgt aat cac aac aaa agg acc acc aac aaa ggc 782
Asn Asn Asn Asn Gly Cys Asn His Asn Lys Arg Thr Thr Asn Lys Gly
145 150 155

[nucleotides 783 to 974 illegible in source]

tca tcg tca aag cct aac act tca tca gtc tcc aac aac cgg agc tca 1022
Ser Ser Ser Lys Pro Asn Thr Ser Ser Val Ser Asn Asn Arg Ser Ser
225 230 235

[nucleotides 1023 to 1070 illegible in source]

aat tca gaa tct gga tca gtt gat gag aag ctg aat ttg atg tcc gag 1118
Asn Ser Glu Ser Gly Ser Val Asp Glu Lys Leu Asn Leu Met Ser Glu
255 260 265

[nucleotides 1119 to 1166 illegible in source]

aca cct act act act act act act act gat gat caa ggc tcg ttg tca 1214
Thr Pro Thr Thr Thr Thr Thr Thr Thr Asp Asp Gln Gly Ser Leu Ser
290 295 300

ttg atc gag aaa tgg ttg ttt gat gat caa ggc ttg gtt cag tgt gat 1262
Leu Ile Glu Lys Trp Leu Phe Asp Asp Gln Gly Leu Val Gln Cys Asp
305 310 315

gat agt caa gaa gat ctc atc gac gtg tct tta gag gag tta aaa taa 1310
Asp Ser Gln Glu Asp Leu Ile Asp Val Ser Leu Glu Glu Leu Lys
320 325 330

tgataacaac agtcaagatt tgttctataa gaaaataaaa cgtatagaac aacgataaag 1370
ctagctaggt ttattaattt ttctttcttt tgtcttttct ctatgatctt tagttacatt 1430
ttattttact gtgtggcttg cttgtggtca agtcgatgaa gatcaaactg tgatatacta 1490
tttatatgta aagtactata aagttaagag tagttgaata aaaaaaaaaa aaaaaaa 1547



<210> 10 

<211> 332

<212> PRT 

<213> Arabidopsis 

<400> 10 

Met Gly Arg Pro Pro Cys Cys Glu Lys Ile Glu Val Lys Lys Gly Pro
1 5 10 15

Trp Thr Pro Glu Glu Asp Ile Ile Leu Val Ser Tyr Ile Gln Gln His
20 25 30

Gly Pro Gly Asn Trp Arg Ser Val Pro Ala Asn Thr Gly Leu Leu Arg
35 40 45

Cys Ser Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu Arg Pro Gly
50 55 60

Ile Lys Arg Gly Asn Phe Thr Gln Pro Glu Glu Lys Met Ile Ile His
65 70 75 80

Leu Gln Ala Leu Leu Gly Asn Arg Trp Ala Ala Ile Ala Ser Tyr Leu
85 90 95

Pro Gln Arg Thr Asp Asn Asp Ile Lys Asn Tyr Trp Asn Thr His Leu
100 105 110

Lys Lys Lys Leu Val Met Met Lys Phe Gln Asn Gly Ile Ile Asn Glu
115 120 125

Asn Lys Thr Asn Leu Ala Thr Asp Ile Ser Ser Cys Asn Asn Asn Asn
130 135 140

Asn Gly Cys Asn His Asn Lys Arg Thr Thr Asn Lys Gly Gln Trp Glu
145 150 155 160

Lys Lys Leu Gln Thr Asp Ile Asn Met Ala Lys Gln Ala Leu Phe Gln
165 170 175

Ala Leu Ser Leu Asp Gln Pro Ser Ser Leu Ile Pro Pro Asp Pro Asp
180 185 190

Ser Pro Lys Pro His His His Ser Thr Thr Thr Tyr Ala Ser Ser Thr
195 200 205

Asp Asn Ile Ser Lys Leu Leu Gln Asn Trp Thr Ser Ser Ser Ser Ser
210 215 220

Lys Pro Asn Thr Ser Ser Val Ser Asn Asn Arg Ser Ser Ser Pro Gly
225 230 235 240

Glu Gly Gly Leu Phe Asp His His Ser Leu Phe Ser Ser Asn Ser Glu
245 250 255

Ser Gly Ser Val Asp Glu Lys Leu Asn Leu Met Ser Glu Thr Ser Met
260 265 270

Phe Lys Gly Glu Ser Lys Pro Asp Ile Asp Met Glu Ala Thr Pro Thr
275 280 285

Thr Thr Thr Thr Thr Thr Asp Asp Gln Gly Ser Leu Ser Leu Ile Glu
290 295 300

Lys Trp Leu Phe Asp Asp Gln Gly Leu Val Gln Cys Asp Asp Ser Gln
305 310 315 320

Glu Asp Leu Ile Asp Val Ser Leu Glu Glu Leu Lys
325 330
<210> 11

<211> 2405 

<212> DNA

<213> Arabidopsis thaliana 

<220> 

<221> CDS 

<222> (381)..(2213)

<223> G419 



aagccacaca atctcttttc ttctctctct ctctgttata tctcttctgt ttaattcttt 60
tattcttctt cgtctatctt ctcctataat ctcttctctc tccctcttca cctaaagaat 120
aagaagaaaa ataattcaca tctttatgca aactactttc ttgtagggtt ttaggagcta 180
tctctattgt cttggttctg atacaaagtt ttgtaatttt catggtatga gaagatttgc 240
ctttctattt tgtttattgg ttctttttaa ctttttcttg gagatgggtt cttgtagatc 300
ttaatgaaac ttctgttttt gtcccaaaaa gagttttctt ttttcttctc ttctttttgg 360
gttttcaatt cttgagagac atg gca aga gat cag ttc tat ggt cac aat aac 413
Met Ala Arg Asp Gln Phe Tyr Gly His Asn Asn
1 5 10

[nucleotides 414 to 845 illegible in source]

aca caa cat cag aat ctc caa cac acg cag atg atg atg atg atg atg 893
Thr Gln His Gln Asn Leu Gln His Thr Gln Met Met Met Met Met Met
160 165 170

[nucleotides 894 to 941 illegible in source]

cat cat cag ttt cag att ggg agt tcc aag tat ttg [illegible] gct caa 989
His His Gln Phe Gln Ile Gly Ser Ser Lys Tyr Leu Ser Pro Ala Gln
190 195 200

gag cta ctg agt gag ttt tgc agt ctt gga gta aag gaa agc gat gaa 1037
Glu Leu Leu Ser Glu Phe Cys Ser Leu Gly Val Lys Glu Ser Asp Glu
205 210 215

gaa gtg atg atg atg aag cat aag aag aag caa aag ggt aaa caa caa 1085
Glu Val Met Met Met Lys His Lys Lys Lys Gln Lys Gly Lys Gln Gln
220 225 230 235

gaa gag tgg gac aca agt cac cac agc aac aat gat caa cat gac caa 1133
Glu Glu Trp Asp Thr Ser His His Ser Asn Asn Asp Gln His Asp Gln
240 245 250

tct gcg act act tct tca aag aaa cat gtt cca cca ctt cac tct ctt 1181
Ser Ala Thr Thr Ser Ser Lys Lys His Val Pro Pro Leu His Ser Leu
255 260 265

gag ttc atg gaa ctt cag aaa aga aaa gcc aag ttg ctc tcc atg ctc 1229
Glu Phe Met Glu Leu Gln Lys Arg Lys Ala Lys Leu Leu Ser Met Leu
270 275 280

gaa gag ctt aaa aga aga tat gga cat tac cga gag caa atg aga gtt 1277
Glu Glu Leu Lys Arg Arg Tyr Gly His Tyr Arg Glu Gln Met Arg Val
285 290 295

gcg gcg gca gcc ttt gaa gcg gcg gtt gga cta gga ggg gca gag ata 1325
Ala Ala Ala Ala Phe Glu Ala Ala Val Gly Leu Gly Gly Ala Glu Ile
300 305 310 315

tac act gcg tta gcg tca agg gca atg tca aga cac ttt cgg tgt tta 1373
Tyr Thr Ala Leu Ala Ser Arg Ala Met Ser Arg His Phe Arg Cys Leu
320 325 330

aaa gac gga ctt gtg gga cag att caa gca aca agt caa gct ttg gga 1421
Lys Asp Gly Leu Val Gly Gln Ile Gln Ala Thr Ser Gln Ala Leu Gly
335 340 345

gag aga gaa gag gat aat cgt gcg gtt tct att gca gca cgt gga gaa 1469
Glu Arg Glu Glu Asp Asn Arg Ala Val Ser Ile Ala Ala Arg Gly Glu
350 355 360

act cca cgg ttg aga ttg ctc gat caa gct ttg cgg caa cag aaa tcg 1517
Thr Pro Arg Leu Arg Leu Leu Asp Gln Ala Leu Arg Gln Gln Lys Ser
365 370 375

tat cgc caa atg act ctt gtt gac gct cat cct tgg cgt cca caa cgc 1565
Tyr Arg Gln Met Thr Leu Val Asp Ala His Pro Trp Arg Pro Gln Arg
380 385 390 395

ggc ttg cct gaa cgc gca gtc aca acg ttg aga gct tgg ctc ttt gaa 1613
Gly Leu Pro Glu Arg Ala Val Thr Thr Leu Arg Ala Trp Leu Phe Glu
400 405 410

cac ttt ctt cac cca tat ccg agc gat gtt gat aag cat ata ttg gcc 1661
His Phe Leu His Pro Tyr Pro Ser Asp Val Asp Lys His Ile Leu Ala
415 420 425

cga caa act ggt tta tca aga agt cag gta tca aat tgg ttt att aat 1709
Arg Gln Thr Gly Leu Ser Arg Ser Gln Val Ser Asn Trp Phe Ile Asn
430 435 440

gca aga gtt agg cta tgg aaa cca atg att gaa gaa atg tac tgt gaa 1757
Ala Arg Val Arg Leu Trp Lys Pro Met Ile Glu Glu Met Tyr Cys Glu
445 450 455

gaa aca aga agt gaa caa atg gag att aca aac ccg atg atg atc gat 1805
Glu Thr Arg Ser Glu Gln Met Glu Ile Thr Asn Pro Met Met Ile Asp
460 465 470 475

act aaa ccg gac ccg gac cag ttg atc cgt gtc gaa ccg gaa tct tta 1853
Thr Lys Pro Asp Pro Asp Gln Leu Ile Arg Val Glu Pro Glu Ser Leu
480 485 490

tcc tca ata gtg aca aac cct aca tcc aaa tcc ggt cac aac tca acc 1901
Ser Ser Ile Val Thr Asn Pro Thr Ser Lys Ser Gly His Asn Ser Thr
495 500 505



cat [illegible] acg atg tcg tta ggg tca acg ttt gac ttt tcc ttg tac ggt 1949
His Gly Thr Met Ser Leu Gly Ser Thr Phe Asp Phe Ser Leu Tyr Gly
510 515 520

aac caa gct gtg aca tac gct ggt gaa gga ggg cca cgt ggt gac gtt 1997
Asn Gln Ala Val Thr Tyr Ala Gly Glu Gly Gly Pro Arg Gly Asp Val
525 530 535

[illegible] tta aca ctt ggg tta caa cgt aac gat ggt aac ggt ggt gtg agt 2045
Ser Leu Thr Leu Gly Leu Gln Arg Asn Asp Gly Asn Gly Gly Val Ser
540 545 550 555

[illegible] tta tct cca [illegible] acg gct caa ggt ggc caa ctt ttc tac ggt 2093
Leu Ala Leu Ser Pro Val Thr Ala Gln Gly Gly Gln Leu Phe Tyr Gly
560 565 570

[illegible] gac cac att gaa gaa gga ccg gtt caa tat tca gcg tcg atg tta 2141
Arg Asp His Ile Glu Glu Gly Pro Val Gln Tyr Ser Ala Ser Met Leu
575 580 585

gat gat gat caa gtt cag aat ttg cct tat agg aat ttg atg gga gct 2189
Asp Asp Asp Gln Val Gln Asn Leu Pro Tyr Arg Asn Leu Met Gly Ala
590 595 600

caa tta ctt cat gat att gtt tga gattaaaaga ttaggaccaa agttatcgat 2243
Gln Leu Leu His Asp Ile Val
605 610

acatattttc caaaaccgat tcggttatgt aacggtttag ttagataaaa accaaattag 2303
atatttatat ataccgttgt ctgattggat tggaggattg gtggacaagg agatattatt 2363
aatgtatgag ttagttggtt cgtcaaaaaa aaaaaaaaaa aa 2405

<210> 12

<211> 610

<212> PRT

<213> Arabidopsis thaliana

<400> 12

Met Ala Arg Asp Gln Phe Tyr Gly His Asn Asn His His His Gln Glu
1 5 10 15

Gln Gln His Gln Met Ile Asn Gln Ile Gln Gly Phe Asp Glu Thr Asn
20 25 30

Gln Asn Pro Thr Asp His His His Tyr Asn His Gln Ile Phe Gly Ser
35 40 45

Asn Ser Asn Met Gly Met Met Ile Asp Phe Ser Lys Gln Gln Gln Ile
50 55 60

Arg Met Thr Ser Gly Ser Asp His His His His His His Gln Thr Ser
65 70 75 80

Gly Gly Thr Asp Gln Asn Gln Leu Leu Glu Asp Ser Ser Ser Ala Met
85 90 95

Arg Leu Cys Asn Val Asn Asn Asp Phe Pro Ser Glu Val Asn Asp Glu
100 105 110

Arg Pro Pro Gln Arg Pro Ser Gln Gly Leu Ser Leu Ser Leu Ser Ser
115 120 125

Ser Asn Pro Thr Ser Ile Ser Leu Gln Ser Phe Glu Leu Arg Pro Gln
130 135 140

Gln Gln Gln Gln Gly Tyr Ser Gly Asn Lys Ser Thr Gln His Gln Asn
145 150 155 160

Leu Gln His Thr Gln Met Met Met Met Met Met Asn Ser His His Gln
165 170 175

Asn Asn Asn Asn Asn Asn His Gln His His Asn His His Gln Phe Gln
180 185 190

Ile Gly Ser Ser Lys Tyr Leu Ser Pro Ala Gln Glu Leu Leu Ser Glu
195 200 205

Phe Cys Ser Leu Gly Val Lys Glu Ser Asp Glu Glu Val Met Met Met
210 215 220

Lys His Lys Lys Lys Gln Lys Gly Lys Gln Gln Glu Glu Trp Asp Thr
225 230 235 240

Ser His His Ser Asn Asn Asp Gln His Asp Gln Ser Ala Thr Thr Ser
245 250 255

Ser Lys Lys His Val Pro Pro Leu His Ser Leu Glu Phe Met Glu Leu
260 265 270

Gln Lys Arg Lys Ala Lys Leu Leu Ser Met Leu Glu Glu Leu Lys Arg
275 280 285

Arg Tyr Gly His Tyr Arg Glu Gln Met Arg Val Ala Ala Ala Ala Phe
290 295 300

Glu Ala Ala Val Gly Leu Gly Gly Ala Glu Ile Tyr Thr Ala Leu Ala
305 310 315 320

Ser Arg Ala Met Ser Arg His Phe Arg Cys Leu Lys Asp Gly Leu Val
325 330 335

Gly Gln Ile Gln Ala Thr Ser Gln Ala Leu Gly Glu Arg Glu Glu Asp
340 345 350

Asn Arg Ala Val Ser Ile Ala Ala Arg Gly Glu Thr Pro Arg Leu Arg
355 360 365

Leu Leu Asp Gln Ala Leu Arg Gln Gln Lys Ser Tyr Arg Gln Met Thr
370 375 380

Leu Val Asp Ala His Pro Trp Arg Pro Gln Arg Gly Leu Pro Glu Arg
385 390 395 400

Ala Val Thr Thr Leu Arg Ala Trp Leu Phe Glu His Phe Leu His Pro
405 410 415

Tyr Pro Ser Asp Val Asp Lys His Ile Leu Ala Arg Gln Thr Gly Leu
420 425 430

Ser Arg Ser Gln Val Ser Asn Trp Phe Ile Asn Ala Arg Val Arg Leu
435 440 445

Trp Lys Pro Met Ile Glu Glu Met Tyr Cys Glu Glu Thr Arg Ser Glu
450 455 460

Gln Met Glu Ile Thr Asn Pro Met Met Ile Asp Thr Lys Pro Asp Pro
465 470 475 480

Asp Gln Leu Ile Arg Val Glu Pro Glu Ser Leu Ser Ser Ile Val Thr
485 490 495

Asn Pro Thr Ser Lys Ser Gly His Asn Ser Thr His Gly Thr Met Ser
500 505 510

Leu Gly Ser Thr Phe Asp Phe Ser Leu Tyr Gly Asn Gln Ala Val Thr
515 520 525

Tyr Ala Gly Glu Gly Gly Pro Arg Gly Asp Val Ser Leu Thr Leu Gly
530 535 540

Leu Gln Arg Asn Asp Gly Asn Gly Gly Val Ser Leu Ala Leu Ser Pro
545 550 555 560

Val Thr Ala Gln Gly Gly Gln Leu Phe Tyr Gly Arg Asp His Ile Glu
565 570 575

Glu Gly Pro Val Gln Tyr Ser Ala Ser Met Leu Asp Asp Asp Gln Val
580 585 590

Gln Asn Leu Pro Tyr Arg Asn Leu Met Gly Ala Gln Leu Leu His Asp
595 600 605

Ile Val
610 

<210> 13 

<211> 989 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (41)..(664)

<223> G464

<400> 13

ctctgctggt atcattggag tctagggttt tgttattgac atg cgt ggt gtg tca 55
Met Arg Gly Val Ser
1 5

gaa ttg gag gtg ggg aag agt aat ctt ccg gcg gag agt gag ctg gaa 103
Glu Leu Glu Val Gly Lys Ser Asn Leu Pro Ala Glu Ser Glu Leu Glu
10 15 20

ttg gga tta ggg ctc agc ctc ggt ggt ggc gcg tgg aaa gag cgt ggg 151
Leu Gly Leu Gly Leu Ser Leu Gly Gly Gly Ala Trp Lys Glu Arg Gly
25 30 35

agg att ctt act gct aag gat ttt cct tcc gtt ggg tct aaa cgc tct 199
Arg Ile Leu Thr Ala Lys Asp Phe Pro Ser Val Gly Ser Lys Arg Ser
40 45 50

gct gaa tct tcc tct cac caa gga gct tct cct cct cgt tca agt caa 247
Ala Glu Ser Ser Ser His Gln Gly Ala Ser Pro Pro Arg Ser Ser Gln
55 60 65

gtg gta gga tgg cca cca att ggg tta cac agg atg aac agt ttg gtt 295
Val Val Gly Trp Pro Pro Ile Gly Leu His Arg Met Asn Ser Leu Val
70 75 80 85

aat aac caa gct atg aag gca gca aga gcg gaa gaa gga gac ggg gag 343
Asn Asn Gln Ala Met Lys Ala Ala Arg Ala Glu Glu Gly Asp Gly Glu
90 95 100

aag aaa gtt gtg aag aat ggt gag ctc aaa gat gtg tca atg aag gtg 391
Lys Lys Val Val Lys Asn Gly Glu Leu Lys Asp Val Ser Met Lys Val
105 110 115

aat ccg aaa gtt cag ggc tta ggg ttt gtt aag gtg aat atg gat gga 439
Asn Pro Lys Val Gln Gly Leu Gly Phe Val Lys Val Asn Met Asp Gly
120 125 130

gtt ggt ata ggc aga aaa gtg gat atg aga gct cat tcg tct tac gaa 487
Val Gly Ile Gly Arg Lys Val Asp Met Arg Ala His Ser Ser Tyr Glu
135 140 145

aac ttg gct cag acg ctt gag gaa atg ttc ttt gga atg aca ggt act 535
Asn Leu Ala Gln Thr Leu Glu Glu Met Phe Phe Gly Met Thr Gly Thr
150 155 160 165

act tgt cga gaa acg gtt aaa cct tta agg ctt tta gat gga tca tca 583
Thr Cys Arg Glu Thr Val Lys Pro Leu Arg Leu Leu Asp Gly Ser Ser
170 175 180

gac ttt gta ctc act tat gaa gat aag ggg att gga tgc ttg ttg gag 631
Asp Phe Val Leu Thr Tyr Glu Asp Lys Gly Ile Gly Cys Leu Leu Glu
185 190 195

atg ttc cat gga gaa tgt tta tca act cgg tga aaaggcttcg gatcatggga 684
Met Phe His Gly Glu Cys Leu Ser Thr Arg
200 205

acctcagaag ctagtggact agctccaaga cgtcaagagc agaaggatag acaaagaaac 744

aaccctgttt agcttccctt ccaaagctgg cattgtttat gtattgtttg aggtttgcaa 804

tttactcgat actttttgaa gaaagtattt tggagaatat ggataaaagc atgcagaagc 864

ttagatatga tttgaatccg gttttcggat atggttttgc ttaggtcatt caattcgtag 924

ttttccagtt tgtttcttct ttggctgtgt accaattatc tatgttctgt gagagaaagc 984

tcttg 989

<210> 14
<211> 207
<212> PRT

<213> Arabidopsis thaliana
<400> 14

Met Arg Gly Val Ser Glu Leu Glu Val Gly Lys Ser Asn Leu Pro Ala
1 5 10 15

Glu Ser Glu Leu Glu Leu Gly Leu Gly Leu Ser Leu Gly Gly Gly Ala
20 25 30

Trp Lys Glu Arg Gly Arg Ile Leu Thr Ala Lys Asp Phe Pro Ser Val
35 40 45

Gly Ser Lys Arg Ser Ala Glu Ser Ser Ser His Gln Gly Ala Ser Pro
50 55 60

Pro Arg Ser Ser Gln Val Val Gly Trp Pro Pro Ile Gly Leu His Arg
65 70 75 80

Met Asn Ser Leu Val Asn Asn Gln Ala Met Lys Ala Ala Arg Ala Glu
85 90 95

Glu Gly Asp Gly Glu Lys Lys Val Val Lys Asn Gly Glu Leu Lys Asp
100 105 110

Val Ser Met Lys Val Asn Pro Lys Val Gln Gly Leu Gly Phe Val Lys
115 120 125

Val Asn Met Asp Gly Val Gly Ile Gly Arg Lys Val Asp Met Arg Ala
130 135 140

His Ser Ser Tyr Glu Asn Leu Ala Gln Thr Leu Glu Glu Met Phe Phe
145 150 155 160

Gly Met Thr Gly Thr Thr Cys Arg Glu Thr Val Lys Pro Leu Arg Leu
165 170 175

Leu Asp Gly Ser Ser Asp Phe Val Leu Thr Tyr Glu Asp Lys Gly Ile
180 185 190

Gly Cys Leu Leu Glu Met Phe His Gly Glu Cys Leu Ser Thr Arg 
195 200 205 



<210> 15

<211> 1065

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (188)..(760)

<223> G482

<400> 15

tcgacccacg cgcccggaca t v. ta <»v-»ct cacaccttct ctttttactc ttcctaaaac 60
cctaaatttc ctcgcttcag tcttcccact caagtcaacc accaatcgaa ttcgatttcg 120
aatcattgat ggaaatgatt tgaaaaaaga gtaaagttta tttttttact ccttgtaatt 180

ttcagaa atg ggg gat tcc gac agg gat tcc ggt gga ggg caa aac ggg 229
Met Gly Asp Ser Asp Arg Asp Ser Gly Gly Gly Gln Asn Gly
1 5 10

aac aac cag aac gga cag tcc tcc ttg tct cca aga gag caa gac agg 277
Asn Asn Gln Asn Gly Gln Ser Ser Leu Ser Pro Arg Glu Gln Asp Arg
15 20 25 30

ttc ttg ccg atc gct aac gtc agc cgg atc atg aag aag gcc ttg ccc 325
Phe Leu Pro Ile Ala Asn Val Ser Arg Ile Met Lys Lys Ala Leu Pro
35 40 45

gcc aac gcc aag atc tct aaa gat gcc aaa gag acg atg cag gag tgt 373
Ala Asn Ala Lys Ile Ser Lys Asp Ala Lys Glu Thr Met Gln Glu Cys
50 55 60

gtc tcc gag ttc atc agc ttc gtc acc gga gaa gca tct gat aag tgt 421
Val Ser Glu Phe Ile Ser Phe Val Thr Gly Glu Ala Ser Asp Lys Cys
65 70 75

cag aag gag aag agg aag acg atc aac gga gac gat ttg ctc tgg gct 469
Gln Lys Glu Lys Arg Lys Thr Ile Asn Gly Asp Asp Leu Leu Trp Ala
80 85 90

atg act act cta ggt ttt gag gat tat gtt gag cca ttg aaa gtt tac 517
Met Thr Thr Leu Gly Phe Glu Asp Tyr Val Glu Pro Leu Lys Val Tyr
95 100 105 110

ttg cag agg ttt agg gag atc gaa ggg gag agg act gga cta ggg agg 565
Leu Gln Arg Phe Arg Glu Ile Glu Gly Glu Arg Thr Gly Leu Gly Arg
115 120 125

cca cag act ggt ggt gag gtc gga gag cat cag aga gat gct gtc gga 613
Pro Gln Thr Gly Gly Glu Val Gly Glu His Gln Arg Asp Ala Val Gly
130 135 140

gat ggc ggt ggg ttc tac ggt ggt ggt ggt ggg atg cag tat cac caa 661
Asp Gly Gly Gly Phe Tyr Gly Gly Gly Gly Gly Met Gln Tyr His Gln
145 150 155

cat cat cag ttt ctt cac cag cag aac cat atg tat gga gcc aca ggt 709
His His Gln Phe Leu His Gln Gln Asn His Met Tyr Gly Ala Thr Gly
160 165 170

ggc ggt agc gac agt gga ggt gga gct gcc tcc ggt agg aca agg act 757
Gly Gly Ser Asp Ser Gly Gly Gly Ala Ala Ser Gly Arg Thr Arg Thr
175 180 185 190

taa caaagattgg tgaagtggat ctctctctgt atatagatac ataaatacat 810

gtatacacat gcctattttt acgacccata taaggtatct atcatgtgat agaacgaaca 870

ttggtgttgg tgatgtaaaa tcagatgtgc attaagggtt tagattttga ggctgtgtaa 930

aagaagatca agtgtgcttt gttggacaat aggattcact aacgaatctg cttcattgga 990

tcttgtatgt aactaaagcc attgtattga atgcaaatgt tttcatttgg gatgctttaa 1050

aaaaaaaaaa aaaaa 1065

<210> 16 

<211> 190 

<212> PRT 

<213> Arabidopsis thaliana

<400> 16

Met Gly Asp Ser Asp Arg Asp Ser Gly Gly Gly Gln Asn Gly Asn Asn
1 5 10 15

Gln Asn Gly Gln Ser Ser Leu Ser Pro Arg Glu Gln Asp Arg Phe Leu
20 25 30

Pro Ile Ala Asn Val Ser Arg Ile Met Lys Lys Ala Leu Pro Ala Asn
35 40 45

Ala Lys Ile Ser Lys Asp Ala Lys Glu Thr Met Gln Glu Cys Val Ser
50 55 60

Glu Phe Ile Ser Phe Val Thr Gly Glu Ala Ser Asp Lys Cys Gln Lys
65 70 75 80

Glu Lys Arg Lys Thr Ile Asn Gly Asp Asp Leu Leu Trp Ala Met Thr
85 90 95

Thr Leu Gly Phe Glu Asp Tyr Val Glu Pro Leu Lys Val Tyr Leu Gln
100 105 110

Arg Phe Arg Glu Ile Glu Gly Glu Arg Thr Gly Leu Gly Arg Pro Gln
115 120 125

Thr Gly Gly Glu Val Gly Glu His Gln Arg Asp Ala Val Gly Asp Gly
130 135 140

Gly Gly Phe Tyr Gly Gly Gly Gly Gly Met Gln Tyr His Gln His His
145 150 155 160

Gln Phe Leu His Gln Gln Asn His Met Tyr Gly Ala Thr Gly Gly Gly
165 170 175

Ser Asp Ser Gly Gly Gly Ala Ala Ser Gly Arg Thr Arg Thr 
180 185 190 



<210> 17

<211> 1409

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (224)..(1093)

<223> G502

<400> 17

60 120 180

tttcatttgg agaggacacg ctgacaagct gactctagca gatctgggac cgtcgaccca

cgcgtccgaa ttgattagga taggatcagg atcatcctca acaacctcct cctaattcct

cctccattca tagtaacaat aatattaaga aagagggtaa act atg tca gaa tta 235
Met Ser Glu Leu
1

tta cag ttg cct cca ggt ttc cga ttt cac cct acc gat gaa gag ctt 283
Leu Gln Leu Pro Pro Gly Phe Arg Phe His Pro Thr Asp Glu Glu Leu
5 10 15 20

gtc atg cac tat ctc tgc cgc aaa tgt gcc tct cag tcc atc gcc gtt 331
Val Met His Tyr Leu Cys Arg Lys Cys Ala Ser Gln Ser Ile Ala Val
25 30 35

ccg atc atc gct gag atc gat ctc tac aaa tac gat cca tgg gag ctt 379
Pro Ile Ile Ala Glu Ile Asp Leu Tyr Lys Tyr Asp Pro Trp Glu Leu
40 45 50

cct ggt tta gcc ttg tat ggt gag aag gaa tgg tac ttc ttc tct ccc 427
Pro Gly Leu Ala Leu Tyr Gly Glu Lys Glu Trp Tyr Phe Phe Ser Pro
55 60 65

agg gac aga aaa tat ccc aac ggt tcg cgt cct aac cgg tcc gct ggt 475
Arg Asp Arg Lys Tyr Pro Asn Gly Ser Arg Pro Asn Arg Ser Ala Gly
70 75 80

tct ggt tac tgg aaa gct acc gga gct gat aaa ccg atc gga cta cct 523
Ser Gly Tyr Trp Lys Ala Thr Gly Ala Asp Lys Pro Ile Gly Leu Pro
85 90 95 100

aaa ccg gtc gga att aag aaa gct ctt gtt ttc tac gcc ggc aaa gct 571
Lys Pro Val Gly Ile Lys Lys Ala Leu Val Phe Tyr Ala Gly Lys Ala
105 110 115

cca aag gga gag aaa acc aat tgg atc atg cac gag tac cgt ctc gcc 619
Pro Lys Gly Glu Lys Thr Asn Trp Ile Met His Glu Tyr Arg Leu Ala
120 125 130

gac gtt gac cgg tcc gtt cgc aag aag aag aat agt ctc agg ctg gat 667
Asp Val Asp Arg Ser Val Arg Lys Lys Lys Asn Ser Leu Arg Leu Asp
135 140 145

gat tgg gtt ctc tgc cgg att tac aac aaa aaa gga gct acc gag agg 715
Asp Trp Val Leu Cys Arg Ile Tyr Asn Lys Lys Gly Ala Thr Glu Arg
150 155 160

cgg ggc cca ccg cct ccg gtt gtt tac ggc gac gaa atc atg gag gag 763
Arg Gly Pro Pro Pro Pro Val Val Tyr Gly Asp Glu Ile Met Glu Glu
165 170 175 180

aag ccg aag gtg acg gag atg gtt atg cct ccg ccg ccg caa cag aca 811
Lys Pro Lys Val Thr Glu Met Val Met Pro Pro Pro Pro Gln Gln Thr
185 190 195

agt gag ttc gcg tat ttc gac acg tcg gat tcg gtg ccg aag ctg cat 859
Ser Glu Phe Ala Tyr Phe Asp Thr Ser Asp Ser Val Pro Lys Leu His
200 205 210

act acg gat tcg agt tgc tcg gag cag gtg gtg tcg ccg gag ttc acg 907
Thr Thr Asp Ser Ser Cys Ser Glu Gln Val Val Ser Pro Glu Phe Thr
215 220 225

agc gag gtt cag agc gag ccc aag tgg aaa gat tgg tcg gcc gta agt 955
Ser Glu Val Gln Ser Glu Pro Lys Trp Lys Asp Trp Ser Ala Val Ser
230 235 240

aat gac aat aac aat acc ctt gat ttt ggg ttt aat tac att gat gcc 1003
Asn Asp Asn Asn Asn Thr Leu Asp Phe Gly Phe Asn Tyr Ile Asp Ala
245 250 255 260

acc gtg gat aac gcg ttt gga gga gga ggg agt agt aat cag atg ttt 1051
Thr Val Asp Asn Ala Phe Gly Gly Gly Gly Ser Ser Asn Gln Met Phe
265 270 275

ccg cta cag gat atg ttc atg tac atg cag aag cct tac tag 1093
Pro Leu Gln Asp Met Phe Met Tyr Met Gln Lys Pro Tyr
280 285

aagggaattc ctttcctgcc gccgaaacgc aacgcaaaac gaccctcgtt tttgcgttta 1153

tggcaacacg agaccgtttt atatggtcaa tgagtgtgcc gattcggcca ttagatttct 1213

gttcagtctt cgtttattct atagaccgtc cgatttcaga tcatccctaa tcggacggtg 1273

gtcgttggat gtatcagtag tgtattactg tgttaggtag aagaaaatcc acttgttctt 1333

aaattggcat aaaagtcaga agctaatatt tatatgtgcc gcaatcaatt taatattttc 1393

tgtctaaaaa aaaaaa 1409

<210> 18
<211> 289
<212> PRT

<213> Arabidopsis thaliana
<400> 18

Met Ser Glu Leu Leu Gln Leu Pro Pro Gly Phe Arg Phe His Pro Thr
1 5 10 15

Asp Glu Glu Leu Val Met His Tyr Leu Cys Arg Lys Cys Ala Ser Gln
20 25 30

Ser Ile Ala Val Pro Ile Ile Ala Glu Ile Asp Leu Tyr Lys Tyr Asp
35 40 45

Pro Trp Glu Leu Pro Gly Leu Ala Leu Tyr Gly Glu Lys Glu Trp Tyr
50 55 60

Phe Phe Ser Pro Arg Asp Arg Lys Tyr Pro Asn Gly Ser Arg Pro Asn
65 70 75 80

Arg Ser Ala Gly Ser Gly Tyr Trp Lys Ala Thr Gly Ala Asp Lys Pro
85 90 95

Ile Gly Leu Pro Lys Pro Val Gly Ile Lys Lys Ala Leu Val Phe Tyr
100 105 110

Ala Gly Lys Ala Pro Lys Gly Glu Lys Thr Asn Trp Ile Met His Glu
115 120 125

Tyr Arg Leu Ala Asp Val Asp Arg Ser Val Arg Lys Lys Lys Asn Ser
130 135 140

Leu Arg Leu Asp Asp Trp Val Leu Cys Arg Ile Tyr Asn Lys Lys Gly
145 150 155 160

Ala Thr Glu Arg Arg Gly Pro Pro Pro Pro Val Val Tyr Gly Asp Glu
165 170 175

Ile Met Glu Glu Lys Pro Lys Val Thr Glu Met Val Met Pro Pro Pro
180 185 190

Pro Gln Gln Thr Ser Glu Phe Ala Tyr Phe Asp Thr Ser Asp Ser Val
195 200 205

Pro Lys Leu His Thr Thr Asp Ser Ser Cys Ser Glu Gln Val Val Ser
210 215 220

Pro Glu Phe Thr Ser Glu Val Gln Ser Glu Pro Lys Trp Lys Asp Trp
225 230 235 240

Ser Ala Val Ser Asn Asp Asn Asn Asn Thr Leu Asp Phe Gly Phe Asn
245 250 255

Tyr Ile Asp Ala Thr Val Asp Asn Ala Phe Gly Gly Gly Gly Ser Ser
260 265 270

Asn Gln Met Phe Pro Leu Gln Asp Met Phe Met Tyr Met Gln Lys Pro
275 280 285

Tyr



<210> 19 

<211> 1481

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (181)..(1188)

<223> G526

<400> 19

cgacccacgc gtccgagact ctctcccagc tagctttctc aattcatttt tctttctcca 60

tcttcttctt gtgtgatctc tctttccaaa taagcttatc attcttacaa aaatatttct 120

gggtttctga tattgttctt gttctcttga atctttatta cttgaaaaac atataaagtg 180

atg gcg gtt gtg gtt gaa gaa ggt gtg gtg ttg aat cat gga ggt gaa 228
Met Ala Val Val Val Glu Glu Gly Val Val Leu Asn His Gly Gly Glu
1 5 10 15

gag ctt gtg gat ttg cca cct ggt ttc agg ttt cat cca aca gac gaa 276
Glu Leu Val Asp Leu Pro Pro Gly Phe Arg Phe His Pro Thr Asp Glu
20 25 30

gag atc ata aca tgt tac ctt aag gag aag gtt tta aac agc cga ttc 324
Glu Ile Ile Thr Cys Tyr Leu Lys Glu Lys Val Leu Asn Ser Arg Phe
35 40 45

acg gct gtg gcc atg gga gaa gct gat ctc aac aag tgt gag cct tgg 372
Thr Ala Val Ala Met Gly Glu Ala Asp Leu Asn Lys Cys Glu Pro Trp
50 55 60

gat ttg cca aag agg gca aag atg ggg gag aaa gag ttc tac ttc ttc 420
Asp Leu Pro Lys Arg Ala Lys Met Gly Glu Lys Glu Phe Tyr Phe Phe
65 70 75 80

tgt caa agg gac agg aag tat ccg act ggg atg agg acg aac cgt gcg 468
Cys Gln Arg Asp Arg Lys Tyr Pro Thr Gly Met Arg Thr Asn Arg Ala
85 90 95

acg gag tca gga tac tgg aaa gcc acc ggg aag gat aag gag atc ttc 516
Thr Glu Ser Gly Tyr Trp Lys Ala Thr Gly Lys Asp Lys Glu Ile Phe
100 105 110

aaa ggc aaa ggt tgt ctc gtt ggg atg aag aaa aca ctt gtg ttt tat 564
Lys Gly Lys Gly Cys Leu Val Gly Met Lys Lys Thr Leu Val Phe Tyr
115 120 125

a 9* 993 aga gct cca aaa ggt gaa aag act aat tgg gtc atg cat gaa 612
Arg Gly Arg Ala Pro Lys Gly Glu Lys Thr Asn Trp Val Met His Glu
130 135 140

tat cgt ctt gaa ggc aaa tat tcg tat tac aat ctc cca aaa tct gca 660
Tyr Arg Leu Glu Gly Lys Tyr Ser Tyr Tyr Asn Leu Pro Lys Ser Ala
145 150 155 160

agg gac gaa tgg gtc gtg tgt agg gtt ttt cac aag aac aat cct tct 708
Arg Asp Glu Trp Val Val Cys Arg Val Phe His Lys Asn Asn Pro Ser
165 170 175

acc aca acc caa cca atg acg aga ata ccc gtt gaa gat ttc aca agg 756
Thr Thr Thr Gln Pro Met Thr Arg Ile Pro Val Glu Asp Phe Thr Arg
180 185 190

atg gat tct cta gag aac att gat cat ctc cta gac ttc tca tct ctt 804
Met Asp Ser Leu Glu Asn Ile Asp His Leu Leu Asp Phe Ser Ser Leu
195 200 205

cct cct ctc ata gac ccg agt ttc atg agt caa acc gaa caa cca aac 852
Pro Pro Leu Ile Asp Pro Ser Phe Met Ser Gln Thr Glu Gln Pro Asn
210 215 220

ttc aaa ccc atc aac cct cca act tac gat atc tca tca cca atc caa 900
Phe Lys Pro Ile Asn Pro Pro Thr Tyr Asp Ile Ser Ser Pro Ile Gln
225 230 235 240

ccc cat cat ttc aat tct tac caa tca atc ttt aac cac cag gtt ttt 948
Pro His His Phe Asn Ser Tyr Gln Ser Ile Phe Asn His Gln Val Phe
245 250 255

ggt tct gct tcg ggc tct acg tac aac aac aac aac gag atg atc aag 996
Gly Ser Ala Ser Gly Ser Thr Tyr Asn Asn Asn Asn Glu Met Ile Lys
260 265 270 
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at9 gag caa tea ett gtt |9 t gt. tct caa 9? a g gc eta ^ ^ 
Met Glu. Gin Ser Leu Val ser va± 2Q5 
275- 280 

IS » SS E K "« S K S S K « - - « ™ 

290 295 

ss » s s e a « a s? 85 i " ? * s 3 

305 310 

g S 15 IS S 35 C B SI | K S S ~ 

325 



PC17US00/31458 



1044 
1092 
1140 
1188 



-W... ...9.9««. «••«— «— ' " 

— — """"" "" C *" 9 ' 

9»«- «— « "• 9tS ' Mt t "" > * 9 ■ 
., 9 „, M « W «« «-««■ «— « t9tM * Ct9t 
„«t,l«cc ,».....« »»»•»••«• '••»""* ""*"""* 



<210> 20 
<211> 335 
<212> PRT 

<213> Arabidopsis thaliana 

<400> 20

Met Ala Val Val Val Glu Glu Gly Val Val Leu Asn His Gly Gly Glu
1 5 10 15

Glu Leu Val Asp Leu Pro Pro Gly Phe Arg Phe His Pro Thr Asp Glu
20 25 30

Glu Ile Ile Thr Cys Tyr Leu Lys Glu Lys Val Leu Asn Ser Arg Phe
35 40 45

Thr Ala Val Ala Met Gly Glu Ala Asp Leu Asn Lys Cys Glu Pro Trp
50 55 60

Asp Leu Pro Lys Arg Ala Lys Met Gly Glu Lys Glu Phe Tyr Phe Phe
65 70 75 80

Cys Gln Arg Asp Arg Lys Tyr Pro Thr Gly Met Arg Thr Asn Arg Ala
85 90 95

Thr Glu Ser Gly Tyr Trp Lys Ala Thr Gly Lys Asp Lys Glu Ile Phe
100 105 110

Lys Gly Lys Gly Cys Leu Val Gly Met Lys Lys Thr Leu Val Phe Tyr
115 120 125

Arg Gly Arg Ala Pro Lys Gly Glu Lys Thr Asn Trp Val Met His Glu
130 135 140

Tyr Arg Leu Glu Gly Lys Tyr Ser Tyr Tyr Asn Leu Pro Lys Ser Ala
145 150 155 160

Arg Asp Glu Trp Val Val Cys Arg Val Phe His Lys Asn Asn Pro Ser
165 170 175

Thr Thr Thr Gln Pro Met Thr Arg Ile Pro Val Glu Asp Phe Thr Arg
180 185 190

Met Asp Ser Leu Glu Asn Ile Asp His Leu Leu Asp Phe Ser Ser Leu
195 200 205

Pro Pro Leu Ile Asp Pro Ser Phe Met Ser Gln Thr Glu Gln Pro Asn
210 215 220

Phe Lys Pro Ile Asn Pro Pro Thr Tyr Asp Ile Ser Ser Pro Ile Gln
225 230 235 240

Pro His His Phe Asn Ser Tyr Gln Ser Ile Phe Asn His Gln Val Phe
245 250 255

Gly Ser Ala Ser Gly Ser Thr Tyr Asn Asn Asn Asn Glu Met Ile Lys
260 265 270

Met Glu Gln Ser Leu Val Ser Val Ser Gln Glu Thr Cys Leu Ser Ser
275 280 285

Asp Val Asn Ala Asn Met Thr Thr Thr Thr Glu Val Ser Ser Gly Pro
290 295 300

Val Met Lys Gln Glu Met Gly Met Met Gly Met Val Asn Gly Ser Lys
305 310 315 320

Ser Tyr Glu Asp Leu Cys Asp Leu Arg Gly Asp Leu Trp Asp Phe
325 330 335 

<210> 21 
<211> 890 
<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (55)..(738)

<223> G545

<400> 21

gcaaccttca aactaaaact cgagagacaa gaaatcctca gaatctttaa ctta atg 57
Met
1

gcg ctc gag gct ctt aca tca cca aga tta gct tct ccg att cct cct 105
Ala Leu Glu Ala Leu Thr Ser Pro Arg Leu Ala Ser Pro Ile Pro Pro
5 10 15

ttg ttc gaa gat tct tca gtc ttc cat gga gtc gag cac tgg aca aag 153
Leu Phe Glu Asp Ser Ser Val Phe His Gly Val Glu His Trp Thr Lys
20 25 30

ggt aag cga tct aag aga tca aga tcc gat ttc cac cac caa aac ctc 201
Gly Lys Arg Ser Lys Arg Ser Arg Ser Asp Phe His His Gln Asn Leu
35 40 45

act gag gaa gag tat cta gct ttt tgc ctc atg ctt ctc gct cgc gac 249
Thr Glu Glu Glu Tyr Leu Ala Phe Cys Leu Met Leu Leu Ala Arg Asp
50 55 60 65

s s s s s s £ » S O S3 51 a E y J 

s s e f b w a ss s s s? a e s K « 
ss a e s s s = a s a s 5 s * w 

100 1U:> 

k s e si s s | a s = = i B K ?i! s 

115 120 

s e s as s s sr. is s s a s s b s s. 

130 135 

s S 85 S E - « « a I s s - $ E if v 

150 iDD 

s s: s s = s s si? f a ss s sr. e is k w 

165 XfyJ 

s s s ss & s ss s sr. s a? ~ e 2 £ e 

180 185 

S S| S S K 55 S 53 «S E K g 5 E « K 

ii? s s s b si s s a » 5 s s " ? "*• a 

210 215 

345 393 441 489 537 585 633 681 729

caa ctt t» ggaaatttac ttagacgata agatttcgtt tgtataccgt 778
Gln Leu

tgagagttgt gtaggaattt gttgactgta cataccaaat tggactttga ctga.cccaa 838
ttcttcttgt tccttattt taaaaattat taaaccgatt ctttaccaca aa 890

<210> 22
<211> 227
<212> PRT

<213> Arabidopsis thaliana
<400> 22

Met Ala Leu Glu Ala Leu Thr Ser Pro Arg Leu Ala Ser Pro Ile Pro
1 5 10 15

Pro Leu Phe Glu Asp Ser Ser Val Phe His Gly Val Glu His Trp Thr
20 25 30

Lys Gly Lys Arg Ser Lys Arg Ser Arg Ser Asp Phe His His Gln Asn
35 40 45

Leu Thr Glu Glu Glu Tyr Leu Ala Phe Cys Leu Met Leu Leu Ala Arg
50 55 60

Asp Asn Arg Gln Pro Pro Pro Pro Pro Ala Val Glu Lys Leu Ser Tyr
65 70 75 80

Lys Cys Ser Val Cys Asp Lys Thr Phe Ser Ser Tyr Gln Ala Leu Gly
85 90 95

Gly His Lys Ala Ser His Arg Lys Asn Leu Ser Gln Thr Leu Ser Gly
100 105 110

Gly Gly Asp Asp His Ser Thr Ser Ser Ala Thr Thr Thr Ser Ala Val
115 120 125

Thr Thr Gly Ser Gly Lys Ser His Val Cys Thr Ile Cys Asn Lys Ser
130 135 140

Phe Pro Ser Gly Gln Ala Leu Gly Gly His Lys Arg Cys His Tyr Glu
145 150 155 160

Gly Asn Asn Asn Ile Asn Thr Ser Ser Val Ser Asn Ser Glu Gly Ala
165 170 175

Gly Ser Thr Ser His Val Ser Ser Ser His Arg Gly Phe Asp Leu Asn
180 185 190

Ile Pro Pro Ile Pro Glu Phe Ser Met Val Asn Gly Asp Asp Glu Val
195 200 205

Met Ser Pro Met Pro Ala Lys Lys Pro Arg Phe Asp Phe Pro Val Lys
210 215 220

Leu Gln Leu
225



<210> 23

<211> 1413

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (86)..(1168)

<223> G561

<400> 23

aatttgtttt tttttctttt gtgggttcaa ttcgaattgt tttccctgag actcaagtta 60

ctgtgtcatt actctgcatt gagca atg ggt agc aac gaa gaa gga aac ccc 112
Met Gly Ser Asn Glu Glu Gly Asn Pro
1 5

act aac aac tct gat aag cca tcg caa gct gct gct cct gag cag agt 160
Thr Asn Asn Ser Asp Lys Pro Ser Gln Ala Ala Ala Pro Glu Gln Ser
10 15 20 25

aat gtt cat gtg tat cat cat gac tgg gct gct atg cag gca tat tat 208
Asn Val His Val Tyr His His Asp Trp Ala Ala Met Gln Ala Tyr Tyr
30 35 40

ggg cct aga gtt ggt ata cct caa tat tac aac tca aat ttg gcg cct 256
Gly Pro Arg Val Gly Ile Pro Gln Tyr Tyr Asn Ser Asn Leu Ala Pro
45 50 55

ggt cat gct cca ccg cct tat atg tgg gcg tct cca tcg cca atg atg 304
Gly His Ala Pro Pro Pro Tyr Met Trp Ala Ser Pro Ser Pro Met Met
60 65 70
gct cct tat gga gca cca tat cca cca ttt tgc cct cct ggt gga gtt
Ala Pro Tyr Gly Ala Pro Tyr Pro Pro Phe Cys Pro Pro Gly Gly Val
75 80 85

ss sss st: sss as e its s is? s s a s a; s e 

S S S E S 55 K SSS S 52 H S S K B S 
151 SSS IS E S5 K IS ft S W S » B $ 3 S 

125 130 

its sss s a; a Efc sss a sss sss = i e a? ss 

140 145 

s a: st: sis s? ss its st: s s sss ss a sss as s 



k as ss is sss is s s: a; a s s a? is e s 
s si a sss s a s s s i s k: s « £ s 

190 19b 

sis ss? ss: sts sss ss: ss; ss: ss: s as ss it: is s ss: 

205 210 21b 

e o s sss a? ss sss ss: s ss: s e a s ss: sts 

220 ^ * 

sss ss nr. as s sss as a ss ss sss as sss as sss ss 

240 ^ 4b 



235 



352 



400 



448 



496 



544 



592 



640 



688 



736 



784 



832 



880 



sss ss sis its a ss sss ss sss sss ss as ss ss: ss ss 

250 255 260 

sss ss sss ss sss ss in its ts its sts sss ss sss sss sss 

270 275 

is: sss sss sss ss: is sss sss ss sss ss ss "s sss its is 

285 290 

cta aac aat gag tcc gag aaa cta cgg ctg gag aac gaa gct ata ttg
Leu Asn Asn Glu Ser Glu Lys Leu Arg Leu Glu Asn Glu Ala Ile Leu
300 305

is: its sss sss H its sts tss a? sss S IS sss sss sts ss 

315 320 

a est e si sss sss ss: ss; sss us sss sss ;s a ss ss: 

330 335 340 

caa ctg tta aat gca agt ccg ata acc gat cct gtc gcg gct agc tga
Gln Leu Leu Asn Ala Ser Pro Ile Thr Asp Pro Val Ala Ala Ser

350 355 
ccgtggccgc aacaatgaga acccgatatt tcttcctttg ggttgtgatt gtaacttaaa 1228 
aggagacttt ttgtttttat tcttagattt gtagctctct gcatagtgag cataaattga 1288 




928 



976 



1024 



1072 



1120 
tgtaatatgg tttaagagat tcggtgttct ctggtgtgtg ctgcaaccac ataattggtg 1348
atagataggt ttagttatat aagcaaatgt attagagata aggggagaca tatttgatgg 1408

tcttt 1413

<210> 24

<211> 360

<212> PRT

<213> Arabidopsis thaliana

<400> 24

Met Gly Ser Asn Glu Glu Gly Asn Pro Thr Asn Asn Ser Asp Lys Pro
1 5 10 15

Ser Gln Ala Ala Ala Pro Glu Gln Ser Asn Val His Val Tyr His His
20 25 30

Asp Trp Ala Ala Met Gln Ala Tyr Tyr Gly Pro Arg Val Gly Ile Pro
35 40 45

Gln Tyr Tyr Asn Ser Asn Leu Ala Pro Gly His Ala Pro Pro Pro Tyr
50 55 60

Met Trp Ala Ser Pro Ser Pro Met Met Ala Pro Tyr Gly Ala Pro Tyr
65 70 75 80

Pro Pro Phe Cys Pro Pro Gly Gly Val Tyr Ala His Pro Gly Val Gln
85 90 95

Met Gly Ser Gln Pro Gln Gly Pro Val Ser Gln Ser Ala Ser Gly Val
100 105 110

Thr Thr Pro Leu Thr Ile Asp Ala Pro Ala Asn Ser Ala Gly Asn Ser
115 120 125

Asp His Gly Phe Met Lys Lys Leu Lys Glu Phe Asp Gly Leu Ala Met
130 135 140

Ser Ile Ser Asn Asn Lys Val Gly Ser Ala Glu His Ser Ser Ser Glu
145 150 155 160

His Arg Ser Ser Gln Ser Ser Glu Asn Asp Gly Ser Ser Asn Gly Ser
165 170 175

Asp Gly Asn Thr Thr Gly Gly Glu Gln Ser Arg Arg Lys Arg Arg Gln
180 185 190

Gln Arg Ser Pro Ser Thr Gly Glu Arg Pro Ser Ser Gln Asn Ser Leu
195 200 205

Pro Leu Arg Gly Glu Asn Glu Lys Pro Asp Val Thr Met Gly Thr Pro
210 215 220

Val Met Pro Thr Ala Met Ser Phe Gln Asn Ser Ala Gly Met Asn Gly
225 230 235 240

Val Pro Gln Pro Trp Asn Glu Lys Glu Val Lys Arg Glu Lys Arg Lys
245 250 255

Gln Ser Asn Arg Glu Ser Ala Arg Arg Ser Arg Leu Arg Lys Gln Ala
260 265 270

Glu Thr Glu Gln Leu Ser Val Lys Val Asp Ala Leu Val Ala Glu Asn
275 280 285

Met Ser Leu Arg Ser Lys Leu Gly Gln Leu Asn Asn Glu Ser Glu Lys
290 295 300

Leu Arg Leu Glu Asn Glu Ala Ile Leu Asp Gln Leu Lys Ala Gln Ala
305 310 315 320

Thr Gly Lys Thr Glu Asn Leu Ile Ser Arg Val Asp Lys Asn Asn Ser
325 330 335

Val Ser Gly Ser Lys Thr Val Gln His Gln Leu Leu Asn Ala Ser Pro
340 345 350

Ile Thr Asp Pro Val Ala Ala Ser
355 360



<210> 25 

<211> 1087 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (104)..(952)

<223> G664 



lUlZccll. at.,.,..t, a«.9»9.. ttg.tctgcc uutacug .tttcsmt 
9 a..e» 9 .t =«..««, .g...9>».9 ta.acf. W «g gj J9| |» 



IS £ g< E I" E S S S K f E S B t" E 

is is e a ss e is ss s. ss s s e k f 5 

S S S S5 SI S S K IS? S5 SI S f "1 IS 

40 45 

s: si s a as s s £ a s - « £ » 21 b 

XI » S SS B B 15 B 22 K if. SI ffi K SI 
SI R SS K 8 s3 S S IS 1!? SI £11 » B SI S 

85 " 90 95 

211 259 307 355 403

gat aac gag ata aag aac tat tgg aac acg cat ata cga aga aag ctt 451
Asp Asn Glu Ile Lys Asn Tyr Trp Asn Thr His Ile Arg Arg Lys Leu
105 110 115

ata aac aga ggg att gat cca acg agt cat aga cca atc caa gaa tca 499
Ile Asn Arg Gly Ile Asp Pro Thr Ser His Arg Pro Ile Gln Glu Ser
120 125 130

tca gct tct caa gat tct aaa cct aca caa cta gaa cca gtt acg agt 547
Ser Ala Ser Gln Asp Ser Lys Pro Thr Gln Leu Glu Pro Val Thr Ser
135 140 145

aat acc att aat atc tca ttc act tct gct cca aag gtc gaa acg ttc 595
Asn Thr Ile Asn Ile Ser Phe Thr Ser Ala Pro Lys Val Glu Thr Phe
150 155 160

cat gaa agt ata agc ttt ccg gga aaa tca gag aaa atc tca atg ctt 643
His Glu Ser Ile Ser Phe Pro Gly Lys Ser Glu Lys Ile Ser Met Leu
165 170 175 180

acg ttc aaa gaa gaa aaa gat gag tgc cca gtt caa gaa aag ttc cca 691
Thr Phe Lys Glu Glu Lys Asp Glu Cys Pro Val Gln Glu Lys Phe Pro
185 190 195

gat ttg aat ctt gag ctc aga atc agt ctt cct gat gat gtt gat cgt 739
Asp Leu Asn Leu Glu Leu Arg Ile Ser Leu Pro Asp Asp Val Asp Arg
200 205 210

ctt caa ggg cat gga aag tca aca acg cca cgt tgt ttc aag tgc agc 787
Leu Gln Gly His Gly Lys Ser Thr Thr Pro Arg Cys Phe Lys Cys Ser
215 220 225

tta ggg atg ata aac ggc atg gag tgc aga tgc gga aga atg aga tgc 835
Leu Gly Met Ile Asn Gly Met Glu Cys Arg Cys Gly Arg Met Arg Cys
230 235 240

gat gta gtc gga ggt agc agc aag ggg agt gac atg agc aat gga ttt 883
Asp Val Val Gly Gly Ser Ser Lys Gly Ser Asp Met Ser Asn Gly Phe
245 250 255 260

gat ttt tta ggg ttg gca aag aaa gag acc act tct ctt ttg ggc ttt 931
Asp Phe Leu Gly Leu Ala Lys Lys Glu Thr Thr Ser Leu Leu Gly Phe
265 270 275

cga agc ttg gag atg aaa taa tattgtcaaa ttttaggcgt aactgtacaa 982
Arg Ser Leu Glu Met Lys
280

aacttttgcc tagataattt gaaagtatat cttcaacttg tatgagaaat ttaactggtg 1042
aattataata tatagaattt gttttttaaa aaaaaaaaaa aaaaa 1087

<210> 26 
<211> 282 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 26 

Met Gly Arg Ser Pro Cys Cys Glu Lys Ala His Thr Asn Lys Gly Ala
1 5 10 15

Trp Thr Lys Glu Glu Asp Glu Arg Leu Val Ala Tyr Ile Lys Ala His
20 25 30

Gly Glu Gly Cys Trp Arg Ser Leu Pro Lys Ala Ala Gly Leu Leu Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Asp
50 55 60

Leu Lys Arg Gly Asn Phe Thr Glu Glu Glu Asp Glu Leu Ile Ile Lys
65 70 75 80

Leu His Ser Leu Leu Gly Asn Lys Trp Ser Leu Ile Ala Gly Arg Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr Trp Asn Thr His Ile
100 105 110

Arg Arg Lys Leu Ile Asn Arg Gly Ile Asp Pro Thr Ser His Arg Pro
115 120 125

Ile Gln Glu Ser Ser Ala Ser Gln Asp Ser Lys Pro Thr Gln Leu Glu
130 135 140

Pro Val Thr Ser Asn Thr Ile Asn Ile Ser Phe Thr Ser Ala Pro Lys
145 150 155 160

Val Glu Thr Phe His Glu Ser Ile Ser Phe Pro Gly Lys Ser Glu Lys
165 170 175

Ile Ser Met Leu Thr Phe Lys Glu Glu Lys Asp Glu Cys Pro Val Gln
180 185 190

Glu Lys Phe Pro Asp Leu Asn Leu Glu Leu Arg Ile Ser Leu Pro Asp
195 200 205

Asp Val Asp Arg Leu Gln Gly His Gly Lys Ser Thr Thr Pro Arg Cys
210 215 220

Phe Lys Cys Ser Leu Gly Met Ile Asn Gly Met Glu Cys Arg Cys Gly
225 230 235 240

Arg Met Arg Cys Asp Val Val Gly Gly Ser Ser Lys Gly Ser Asp Met
245 250 255

Ser Asn Gly Phe Asp Phe Leu Gly Leu Ala Lys Lys Glu Thr Thr Ser
260 265 270

Leu Leu Gly Phe Arg Ser Leu Glu Met Lys
275 280



<210> 27 

<211> 228 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (1)..(228)

<223> G682 



sss s s a s s a s w s = = !!« ? "> " 
s a ss ss c e a s s s 3 e f a ss 



96 



20 25 
atg agt caa gaa gaa gaa gat ttg gtc tct cga atg cat aag ctt gtc 144
Met Ser Gln Glu Glu Glu Asp Leu Val Ser Arg Met His Lys Leu Val
35 40 45

ggt gac agg tgg gaa ctg ata gct ggg agg atc cca gga aga acc gct 192
Gly Asp Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
50 55 60

gga gaa att gag agg ttt tgg gtc atg aaa aat tga 228
Gly Glu Ile Glu Arg Phe Trp Val Met Lys Asn
65 70 75



<210> 28 

<211> 75

<212> PRT

<213> Arabidopsis thaliana

<400> 28

Met Asp Asn His Arg Arg Thr Lys Gln Pro Lys Thr Asn Ser Ile Val
1 5 10 15

Thr Ser Ser Ser Glu Glu Val Ser Ser Leu Glu Trp Glu Val Val Asn
20 25 30

Met Ser Gln Glu Glu Glu Asp Leu Val Ser Arg Met His Lys Leu Val
35 40 45

Gly Asp Arg Trp Glu Leu Ile Ala Gly Arg Ile Pro Gly Arg Thr Ala
50 55 60

Gly Glu Ile Glu Arg Phe Trp Val Met Lys Asn
65 70 75


<210> 29

<211> 480

<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (1)..(480)

<223> G911

<400> 29

atg ggt ctt cct gaa gat ttc atc acc gag ctt cag att cca ggt tac 48
Met Gly Leu Pro Glu Asp Phe Ile Thr Glu Leu Gln Ile Pro Gly Tyr
1 5 10 15

ata tta aag ata ctt tac gtc atc ggt ttc ttt aga gac atg gtc gat 96
Ile Leu Lys Ile Leu Tyr Val Ile Gly Phe Phe Arg Asp Met Val Asp
20 25 30

gct ctt tgt cct tac att ggt cta cct agt ttt cta gac cac aac gag 144
Ala Leu Cys Pro Tyr Ile Gly Leu Pro Ser Phe Leu Asp His Asn Glu
35 40 45

acc tct gga ccc gat ccg acc cga cac gct ctc tct acg tca gcg agt 192
Thr Ser Gly Pro Asp Pro Thr Arg His Ala Leu Ser Thr Ser Ala Ser
50 55 60

ctt gct aac gag ttg atc ccg gtg gtt cgg ttc tcg gat ctt ccg acc 240
Leu Ala Asn Glu Leu Ile Pro Val Val Arg Phe Ser Asp Leu Pro Thr
65 70 75 80

gat ccg gaa gat tgt tgt acg gtt tgt ttg tca gat ttt gag tcc gac 288
Asp Pro Glu Asp Cys Cys Thr Val Cys Leu Ser Asp Phe Glu Ser Asp
85 90 95



B "i e a a = - " 9 - i « ss ?;? s if; stl s " 

100 105 

^ a c s 5g s e c 5 ss s » i s >s e 

115 120 X ^ 

$ 31 S 3 S S S 5 K W S g «■ S K S 

r, k a e s s a s s a e s s s " 

145 150 ^ 



<210> 30 

<211> 159 

<212> PRT 

<213> Arabidopsis thaliana 



<400> 30

Met Gly Leu Pro Glu Asp Phe Ile Thr Glu Leu Gln Ile Pro Gly Tyr
1 5 10 15

Ile Leu Lys Ile Leu Tyr Val Ile Gly Phe Phe Arg Asp Met Val Asp
20 25 30

Ala Leu Cys Pro Tyr Ile Gly Leu Pro Ser Phe Leu Asp His Asn Glu
35 40 45

Thr Ser Gly Pro Asp Pro Thr Arg His Ala Leu Ser Thr Ser Ala Ser
50 55 60

Leu Ala Asn Glu Leu Ile Pro Val Val Arg Phe Ser Asp Leu Pro Thr
65 70 75 80

Asp Pro Glu Asp Cys Cys Thr Val Cys Leu Ser Asp Phe Glu Ser Asp
85 90 95

Asp Lys Val Arg Gln Leu Pro Lys Cys Gly His Val Phe His His His
100 105 110

Cys Leu Asp Arg Trp Ile Val Asp Tyr Asn Lys Met Lys Cys Pro Val
115 120 125

Cys Arg His Arg Phe Leu Pro Lys Glu Lys Tyr Thr Gln Cys Asp Trp
130 135 140

Gly Ser Gly Ser Asp Trp Phe Ser Asp Glu Val Glu Ser Thr Asn
145 150 155



<210> 31 

<211> 1221 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (162)..(1013)

<223> G964 
<400> 31

atttctcttc cacaaagagt cctaacttcg agttgaaaca aacaccattt ctcatctcta 60

tctcagaaag aacaaaccat ttcgtgttct ttctttctct attctcataa ggaaatataa 120

ttcctgaaac tgttgagttc ttgtgaaagg aaataaaaaa c atg atg atg ggc aaa 176
Met Met Met Gly Lys
1 5

gaa gat cta ggt ttg agc cta agc tta ggg ttt tca caa aat cac aat 224
Glu Asp Leu Gly Leu Ser Leu Ser Leu Gly Phe Ser Gln Asn His Asn
10 15 20

cct ctt cag atg aat ctg aat cct aac tct tca tta tca aac aat ctc 272
Pro Leu Gln Met Asn Leu Asn Pro Asn Ser Ser Leu Ser Asn Asn Leu
25 30 35

cag aga ctc cca tgg aac caa aca ttc gat cct aca tca gat ctt cgc 320
Gln Arg Leu Pro Trp Asn Gln Thr Phe Asp Pro Thr Ser Asp Leu Arg
40 45 50

aag ata gac gtg aac agt ttt cca tca acg gtt aac tgc gag gaa gac 368
Lys Ile Asp Val Asn Ser Phe Pro Ser Thr Val Asn Cys Glu Glu Asp
55 60 65

aca gga gtt tcg tca cca aac agt acg atc tca agc acc att agc ggg 416
Thr Gly Val Ser Ser Pro Asn Ser Thr Ile Ser Ser Thr Ile Ser Gly
70 75 80 85

aag aga agt gag aga gaa gga atc tcc gga acc ggc gtt ggc tcc ggc 464
Lys Arg Ser Glu Arg Glu Gly Ile Ser Gly Thr Gly Val Gly Ser Gly
90 95 100

gac gat cac gac gag atc act ccg gat cga ggg tac tca cgt gga acc 512
Asp Asp His Asp Glu Ile Thr Pro Asp Arg Gly Tyr Ser Arg Gly Thr
105 110 115

tca gat gaa gaa gaa gac ggg ggc gaa acg tcg agg aag aag ctc agg 560
Ser Asp Glu Glu Glu Asp Gly Gly Glu Thr Ser Arg Lys Lys Leu Arg
120 125 130

tta tca aaa gat cag tct gct ttt ctc gaa gag act ttc aaa gaa cac 608
Leu Ser Lys Asp Gln Ser Ala Phe Leu Glu Glu Thr Phe Lys Glu His
135 140 145

aac act ctc aat ccc aaa cag aag cta gct ttg gct aag aag ctg aac 656
Asn Thr Leu Asn Pro Lys Gln Lys Leu Ala Leu Ala Lys Lys Leu Asn
150 155 160 165

ttg acg gca aga caa gtg gaa gtg tgg ttc caa aac aga aga gct aga 704
Leu Thr Ala Arg Gln Val Glu Val Trp Phe Gln Asn Arg Arg Ala Arg
170 175 180

acc aag tta aag caa acg gag gta gat tgc gaa tac ttg aaa cgg tgc 752
Thr Lys Leu Lys Gln Thr Glu Val Asp Cys Glu Tyr Leu Lys Arg Cys
185 190 195

gta gag aag cta acg gaa gag aac cgg aga ctt cag aaa gag gct atg 800
Val Glu Lys Leu Thr Glu Glu Asn Arg Arg Leu Gln Lys Glu Ala Met
200 205 210

gag ctt cga act ctc aag ctg tct cca caa ttc tac ggt cag atg act 848
Glu Leu Arg Thr Leu Lys Leu Ser Pro Gln Phe Tyr Gly Gln Met Thr
215 220 225

cca cca act aca ctc atc atg tgt cct tcg tgc gag cgt gta gct ggt 896
Pro Pro Thr Thr Leu Ile Met Cys Pro Ser Cys Glu Arg Val Ala Gly
230 235 240 245

cca tca tca tcg aac cat cac cac aat cac agg ccg gtt tcg att aac 944
Pro Ser Ser Ser Asn His His His Asn His Arg Pro Val Ser Ile Asn
250 255 260
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s ss s e & si is a e m s if? a s s sa 



Pro Trp Ile Ala Cys Ala Gly Gln Val Ala His Gly Leu Asn Phe Glu
265 270 275

gcc ttg cgt cca cga tcg taa tttttagtgg tgggggaagg gtgttttggg 1043
Ala Leu Arg Pro Arg Ser
280

ttttttcatt atcgttatat agtctatctg tgtggggtca ttgtaatttt ggatgattgg 1103
ccttctcatg aactagtcat atgtatgatg caaccttaaa aatatttcaa gtagcaaaac 1163
ttaattacaa acttgctata ttaaccaaaa attatgaaaa aaaaaaaaaa aaaaaaaa 1221

<210> 32
<211> 283
<212> PRT

<213> Arabidopsis thaliana
<400> 32

Met Met Met Gly Lys Glu Asp Leu Gly Leu Ser Leu Ser Leu Gly Phe
1 5 10 15

Ser Gln Asn His Asn Pro Leu Gln Met Asn Leu Asn Pro Asn Ser Ser
20 25 30

Leu Ser Asn Asn Leu Gln Arg Leu Pro Trp Asn Gln Thr Phe Asp Pro
35 40 45

Thr Ser Asp Leu Arg Lys Ile Asp Val Asn Ser Phe Pro Ser Thr Val
50 55 60

Asn Cys Glu Glu Asp Thr Gly Val Ser Ser Pro Asn Ser Thr Ile Ser
65 70 75 80

Ser Thr Ile Ser Gly Lys Arg Ser Glu Arg Glu Gly Ile Ser Gly Thr
85 90 95

Gly Val Gly Ser Gly Asp Asp His Asp Glu Ile Thr Pro Asp Arg Gly
100 105 110

Tyr Ser Arg Gly Thr Ser Asp Glu Glu Glu Asp Gly Gly Glu Thr Ser
115 120 125

Arg Lys Lys Leu Arg Leu Ser Lys Asp Gln Ser Ala Phe Leu Glu Glu
130 135 140

Thr Phe Lys Glu His Asn Thr Leu Asn Pro Lys Gln Lys Leu Ala Leu
145 150 155 160

Ala Lys Lys Leu Asn Leu Thr Ala Arg Gln Val Glu Val Trp Phe Gln
165 170 175

Asn Arg Arg Ala Arg Thr Lys Leu Lys Gln Thr Glu Val Asp Cys Glu
180 185 190

Tyr Leu Lys Arg Cys Val Glu Lys Leu Thr Glu Glu Asn Arg Arg Leu
195 200 205

Gln Lys Glu Ala Met Glu Leu Arg Thr Leu Lys Leu Ser Pro Gln Phe
210 215 220

Tyr Gly Gln Met Thr Pro Pro Thr Thr Leu Ile Met Cys Pro Ser Cys
225 230 235 240

Glu Arg Val Ala Gly Pro Ser Ser Ser Asn His His His Asn His Arg
245 250 255

Pro Val Ser Ile Asn Pro Trp Ile Ala Cys Ala Gly Gln Val Ala His
260 265 270

Gly Leu Asn Phe Glu Ala Leu Arg Pro Arg Ser
275 280

<210> 33 

<211> 1249 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (82).. (918) 

<223> G394 

<400> 33 

gaaattctta acaaacaatt ttcttcataa tattaattct caagatctta aagattatat 60 

taatacgaag agaaaattca a atg ggt ctt gat gat tca tgc aac aca ggt 111

Met Gly Leu Asp Asp Ser Cys Asn Thr Gly
1 5 10

ctt gtt ctt ggt tta ggc ctc tca cca acg cct aat aat tac aat cat 159
Leu Val Leu Gly Leu Gly Leu Ser Pro Thr Pro Asn Asn Tyr Asn His
15 20 25

gcc atc aag aaa tct tcc tcc act gtg gac cat cgt ttc atc agg ctc 207
Ala Ile Lys Lys Ser Ser Ser Thr Val Asp His Arg Phe Ile Arg Leu
30 35 40

gat ccg tcg ttg act cta agc cta tcc ggt gag agc tac aag atc aag 255
Asp Pro Ser Leu Thr Leu Ser Leu Ser Gly Glu Ser Tyr Lys Ile Lys
45 50 55

act ggt gcc ggc gcc ggc gac caa att tgc cgg cag acc tcg tcc cac 303
Thr Gly Ala Gly Ala Gly Asp Gln Ile Cys Arg Gln Thr Ser Ser His
60 65 70

agc ggc atc tca tct ttc tcg agc gga agg gta aag aga gaa aga gaa 351
Ser Gly Ile Ser Ser Phe Ser Ser Gly Arg Val Lys Arg Glu Arg Glu
75 80 85 90

atc tcc ggc ggc gat gga gaa gaa gag gcg gag gag acg acg gag aga 399
Ile Ser Gly Gly Asp Gly Glu Glu Glu Ala Glu Glu Thr Thr Glu Arg
95 100 105

gtg gtg tgt tcg aga gtg agt gat gat cat gac gat gaa gaa ggt gtt 447
Val Val Cys Ser Arg Val Ser Asp Asp His Asp Asp Glu Glu Gly Val
110 115 120

agt gct cgt aaa aag ctt aga ctc act aaa caa caa tct gct ctt ctc 495
Ser Ala Arg Lys Lys Leu Arg Leu Thr Lys Gln Gln Ser Ala Leu Leu
125 130 135

gaa gat aac ttc aaa ctt cat agc acc ctt aat ccc aag caa aaa caa 543
Glu Asp Asn Phe Lys Leu His Ser Thr Leu Asn Pro Lys Gln Lys Gln
140 145 150

gct ctt gcg aga cag ctg aat cta agg cct aga caa gtt gaa gtg tgg 591
Ala Leu Ala Arg Gln Leu Asn Leu Arg Pro Arg Gln Val Glu Val Trp
155 160 165

s t. » a ss m is s s w s « g « 
s e s s S s s s: ir s a a e g = « 

Y 190 195 

SI S S 5 « 25 K g S W E «" £ » = = 

205 210 
CC9 t« tac atg ca= atg cog gcg gcg act ttg act atg tgc cot tot 
Pro Phe Tyr Met His Met Pro Ala Ala mr ^ 
220 Z " 

gj e a a b « K s e a; g e a s ». g 

235 240 

K £ E S g S I?? E 5S S S S S W g & 

s.s is; s » s a s g e k s " 9 ™' 
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9 ttatttaat tctttttgtt ggtttttttt ttgtttctta aatcaaatta 99 aattagtt 
a9 aa 9 ataaa t=ccagg 9 aa aaaatattac gttgaaattg ggg 9 9aaatg gggtatagtc 
cttatagata agactcttca acgattccac tttatttttc 9 gtg 99 attg ttggttgatg 
aagaaaaaaa aatagtttgt aattaca 99 t ttaaatat 9 t a 9 agaaaaaa tgacgaatat 
gtattatctt gttttttttt ccttcgaata tgtattacgg taatataaat ttgcttgtaa 
aaataataaa tatattattt g 



<210> 34 

<211> 278 

<212> PRT 

<213> Arabidopsis thaliana 



<400> 34

Met Gly Leu Asp Asp Ser Cys Asn Thr Gly Leu Val Leu Gly Leu Gly
1 5 10 15

Leu Ser Pro Thr Pro Asn Asn Tyr Asn His Ala Ile Lys Lys Ser Ser
20 25 30

Ser Thr Val Asp His Arg Phe Ile Arg Leu Asp Pro Ser Leu Thr Leu
35 40 45

Ser Leu Ser Gly Glu Ser Tyr Lys Ile Lys Thr Gly Ala Gly Ala Gly
50 55 60

Asp Gln Ile Cys Arg Gln Thr Ser Ser His Ser Gly Ile Ser Ser Phe
65 70 75 80

Ser Ser Gly Arg Val Lys Arg Glu Arg Glu Ile Ser Gly Gly Asp Gly
85 90 95

Glu Glu Glu Ala Glu Glu Thr Thr Glu Arg Val Val Cys Ser Arg Val
100 105 110

Ser Asp Asp His Asp Asp Glu Glu Gly Val Ser Ala Arg Lys Lys Leu
115 120 125

Arg Leu Thr Lys Gln Gln Ser Ala Leu Leu Glu Asp Asn Phe Lys Leu
130 135 140

His Ser Thr Leu Asn Pro Lys Gln Lys Gln Ala Leu Ala Arg Gln Leu
145 150 155 160

Asn Leu Arg Pro Arg Gln Val Glu Val Trp Phe Gln Asn Arg Arg Ala
165 170 175

Arg Thr Lys Leu Lys Gln Thr Glu Val Asp Cys Glu Phe Leu Lys Lys
180 185 190

Cys Cys Glu Thr Leu Thr Asp Glu Asn Arg Arg Leu Gln Lys Glu Leu
195 200 205

Gln Asp Leu Lys Ala Leu Lys Leu Ser Gln Pro Phe Tyr Met His Met
210 215 220

Pro Ala Ala Thr Leu Thr Met Cys Pro Ser Cys Glu Arg Leu Gly Gly
225 230 235 240

Gly Gly Val Gly Gly Asp Thr Thr Ala Val Asp Glu Glu Thr Ala Lys
245 250 255

Gly Ala Phe Ser Ile Val Thr Lys Pro Arg Phe Tyr Asn Pro Phe Thr
260 265 270

Asn Pro Ser Ala Ala Cys
275

<210> 35 

<211> 1147 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (33).. (695) 

<223> G489 

<400> 35 

tggatcaaca agaccatgga cagtctggag ct atg aac tat ggc aca aac cca 53 

Met Asn Tyr Gly Thr Asn Pro 
1 5 

tac caa acc aac ccg atg agc acc act gct gct act gta gca gga ggt 101
Tyr Gln Thr Asn Pro Met Ser Thr Thr Ala Ala Thr Val Ala Gly Gly
10 15 20

gcg gca caa cca ggc cag ctg gcg ttc cac cag atc cat cag cag cag 149
Ala Ala Gln Pro Gly Gln Leu Ala Phe His Gln Ile His Gln Gln Gln
25 30 35

cag cag caa cag ctg gca cag cag ctt caa gca ttt tgg gag aac caa 197
Gln Gln Gln Gln Leu Ala Gln Gln Leu Gln Ala Phe Trp Glu Asn Gln
40 45 50 55
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ttc aaa aaq att gag aag act acc gat ttc aag aac cac age ctt ccc 24 :> 

Phe Lys G?u Ue III Lys Thr Thr Asp Phe Lys Asn His Ser Leu Pro 
60 65 

ctt gcg aga ate aag aaa ate atg aaa gcg gat gaa gat gtc cgt atg 
Leu All Arg lie Lys Lys lie Met Lys Ala Asp Glu Asp Val Arg Met 
75 80 Bb 

ate tcq get gag gcg ccg gtc gtg ttt gca agg gee tgt gag atg ttc 
Ue Ser A^a Glu All PrI Val Val Phe Ala Arg Ala Cys Glu Met Phe 
90 95 100 

'At a a S S 25 m 3 SI £ B S S! S! £ W 



105 I 10 115 

si si s a a a a a a a a a a s s s 

120 125 130 

na . att ttt qat ttc ctt gtg gat att gtt ccc egg gag gat etc cga 

£sp Ue Phe lip Phe Leu la? Asp lie Val Pro Arg Glu Asp Leu Arg 

140 145 

oat aaa qtc ttg gga agt att ccg agg ggc act gtc ccg gaa get get 
jgp Glu Val Leu Gly sir lie Pro Arg Gly Thr Val Pro Glu Ala Ala 

art act aat tac ccg tat gga tac ttg cct gca gga act get cca at a 
lit 111 lly 4r Pro iyr Sly Tyr Leu Pro Ala Gly Thr Ala Pro lie 



140 145 

gga agt att ccg agg 
Gly Ser He Pro Arg 
isB * 160 

tac ttg 
Tyr Leu 

170 175 180 

aaa aat ccq qqa atg gtt atg ggt aat ccc ggt ggt gcg tat cca cct 
|?y Asn Pro Gly Me? Val Me? Gly Aen Pro Gly Gly Ala Tyr Pro Pro 
185 190 I 95 

*at cct tat atq qgt caa cca atg tgg caa caa cag gca cct gac caa 
Asn Pro ^r Se? Gly Gin Pro Me? Trp Gin Gin Gin Ala Pro Asp Gin 
200 205 210 215 



cct gac cag gaa aat tag caagaaactg tgagtcttcc agcttcgcgg 725
Pro Asp Gln Glu Asn
220

ccgctctaga caggcctcgt accggatcct ctagctagag ctttcgttcg tatcatcggt 785
ttcgacaacg ttcgtcaagt tcaatgcatc agtttcattg cgcacacacc agaatcctac 845
tgagtttgag tattatggca ttgggaaaac tgtttttctt gtccatttgt tgtgcttgta 905
atttactgtg ttttttattc ggttttcgct atcgaactgt gaaatggaaa tggatggaga 965
agagttaatg aatgatatgg ccttttgttc attctcaaat taatattatt tggtttttct 1025
cttatttgtg gggatgaatt tgaaattata agagatatgc aaacattttg tttgagtaaa 1085
atgtgcaaat cgtggcctct aatgacccga agttaatatg aggagtaaaa cacttgtagg 1145
tg 1147

<210> 36 

<211> 220 

<212> PRT 

<213> Arabidopsis thaliana 



<400> 36

Met Asn Tyr Gly Thr Asn Pro Tyr Gln Thr Asn Pro Met Ser Thr Thr
1 5 10 15

Ala Ala Thr Val Ala Gly Gly Ala Ala Gln Pro Gly Gln Leu Ala Phe
20 25 30



WO 01/36598 



PCT/US00/31458 



MBI16 Sequence Listing. ST25 

His Gln Ile His Gln Gln Gln Gln Gln Gln Gln Leu Ala Gln Gln Leu
35 40 45

Gln Ala Phe Trp Glu Asn Gln Phe Lys Glu Ile Glu Lys Thr Thr Asp
50 55 60

Phe Lys Asn His Ser Leu Pro Leu Ala Arg Ile Lys Lys Ile Met Lys
65 70 75 80

Ala Asp Glu Asp Val Arg Met Ile Ser Ala Glu Ala Pro Val Val Phe
85 90 95

Ala Arg Ala Cys Glu Met Phe Ile Leu Glu Leu Thr Leu Arg Ser Trp
100 105 110

Asn His Thr Glu Glu Asn Lys Arg Arg Thr Leu Gln Lys Asn Asp Ile
115 120 125

Ala Ala Ala Val Thr Arg Thr Asp Ile Phe Asp Phe Leu Val Asp Ile
130 135 140

Val Pro Arg Glu Asp Leu Arg Asp Glu Val Leu Gly Ser Ile Pro Arg
145 150 155 160

Gly Thr Val Pro Glu Ala Ala Ala Ala Gly Tyr Pro Tyr Gly Tyr Leu
165 170 175

Pro Ala Gly Thr Ala Pro Ile Gly Asn Pro Gly Met Val Met Gly Asn
180 185 190

Pro Gly Gly Ala Tyr Pro Pro Asn Pro Tyr Met Gly Gln Pro Met Trp
195 200 205

Gln Gln Gln Ala Pro Asp Gln Pro Asp Gln Glu Asn
210 215 220

<210> 37 

<211> 1262 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (217).. (957) 

<223> G463 

<400> 37 

ctcgagctac gtcaggggtc tctttctgtt tgtttgtttt cttgtttcct tctctctctc 60 

tctttctttc tttgtcttcc tttcccaggt tgtttttttt tgctctctct gccttcttga 120 

ctttcaaaag actctttctt tcttttggat tgattttgga ttctagggct ctctttcttt 180 

tagtgggttt ttgttgttgt tgttgtggtc tctctg atg att act gaa ctt gag 234 

Met He Thr Glu Leu Glu 
1 5 

atg ggg aaa ggt gag agt gag ctt gag ctt ggt cta ggg ctg agt ctt 282
Met Gly Lys Gly Glu Ser Glu Leu Glu Leu Gly Leu Gly Leu Ser Leu 
10 15 20 
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a? E E S E E E 5S E E E « f « « - 

25 30 

s E as s k a s s a e si e ^ e e e 

E S E S E E E S E IS E K E S S E 

55 60 65 

E E E S S! B C E E E E S E S K E 

75 80 

» E E E E E E E E S E S E E E E 

90 95 

e b e e a a e a e e e e e e e e 

105 110 

S E E a E E E e e e E e s e « E 

120 125 1JU 

S3 E E E B E E E E O E E E E E E 

135 * 40 

tct tac gag aat ttg gcg caa aca ttg gaa gat atg ttc ttt cgc act
Ser Tyr Glu Asn Leu Ala Gln Thr Leu Glu Asp Met Phe Phe Arg Thr
155 160 165

E E E E E E E E EE S S E E E E 

170 175 

E E K E S3 E E IS E E S E E E E E 

185 190 195 

a; e e a s e E s e e 3 e si? e e e 

200 205 
*-™ nt-r, aaa aaa eta cat gtg atg aaa acc tct gaa get aat gga etc 
HI 93 Lys S Leu £rg Sa? Me? Lys Thr Ser Glu Ala Asn Gly Leu 
215 220 

K E E E E E E E E E E E E ffi E E 

235 240 
tag atctcttttc gacgttacgg tgttacaggt tttatatttt ggggttttgc
aagtctgaga tacttctgaa gcaagcataa gctagattga tcttatatcc agtttgtgta
ttttcttggt tcttataatg gtttttactg gttttcttta gttttttttt ttgctgtctt
ttaattttcg gttgcgattt cactatatac tatggatgga agagaatgct ctttatatct
tttactacac tgtaaatatt tgaagcttat ctaatatcgt ttttaagggt taaaaaaccc
tgacgtagcc tcgag



330 

378 

426 

474 

522 

570 

618 

666 

714 

762 

810 

858 

906 

954 

1007 
1067 
1127 
1187 
1247 
1262 



<210> 38 

<211> 246 

<212> PRT 

<213> Arabidopsis thaliana 



<400> 38 
Met Ile Thr Glu Leu Glu Met Gly Lys Gly Glu Ser Glu Leu Glu Leu
1 5 10 15

Gly Leu Gly Leu Ser Leu Gly Gly Gly Thr Ala Ala Lys Ile Gly Lys
20 25 30

Ser Gly Gly Gly Gly Ala Trp Gly Glu Arg Gly Arg Leu Leu Thr Ala
35 40 45

Lys Asp Phe Pro Ser Val Gly Ser Lys Arg Ala Ala Asp Ser Ala Ser
50 55 60

His Ala Gly Ser Ser Pro Pro Arg Ser Ser Gln Val Val Gly Trp Pro
65 70 75 80

Pro Ile Gly Ser His Arg Met Asn Ser Leu Val Asn Asn Gln Ala Thr
85 90 95

Lys Ser Ala Arg Glu Glu Glu Glu Ala Gly Lys Lys Lys Val Lys Asp
100 105 110

Asp Glu Pro Lys Asp Val Thr Lys Lys Val Asn Gly Lys Val Gln Val
115 120 125

Gly Phe Ile Lys Val Asn Met Asp Gly Val Ala Ile Gly Arg Lys Val
130 135 140

Asp Leu Asn Ala His Ser Ser Tyr Glu Asn Leu Ala Gln Thr Leu Glu
145 150 155 160

Asp Met Phe Phe Arg Thr Asn Pro Gly Thr Val Gly Leu Thr Ser Gln
165 170 175

Phe Thr Lys Pro Leu Arg Leu Leu Asp Gly Ser Ser Glu Phe Val Leu
180 185 190

Thr Tyr Glu Asp Lys Glu Gly Asp Trp Met Leu Val Gly Asp Val Pro
195 200 205

Trp Arg Met Phe Ile Asn Ser Val Lys Arg Leu Arg Val Met Lys Thr
210 215 220

Ser Glu Ala Asn Gly Leu Ala Ala Arg Asn Gln Glu Pro Asn Glu Arg
225 230 235 240

Gln Arg Lys Gln Pro Val
245



<210> 39 

<211> 905 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (76).. (837) 

<223> G767 
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cjaagtttca actttcaaac atatctttac agttctttct tgctaaacaa caataaaggg 60 

gaaaataggt taatt atg atg aaa tct ggg get gat ttg caa ttt cca cca 
Met Met Lys Ser Gly Ala Asp Leu Gin Phe Pro Pro 



1 



5 10 



K £ S S SS IS S S E! SB £ IS 25 S & S 

15 20 25 

tot cat aaa tgc gcg teg cag ccg ate cct get ccg att ate ace gaa 
£s Its Lys C?s All Ser Gin Pro lie Pro Ala Pro lie lie Thr Glu 
30 35 40 

etc qat ttq tac cga tat gat cct tgg gac ctt ccc gac atg get ttg 
Leu Asp Leu 5yr Arg Tyr Asp Pro Trp Asp Leu Pro Asp Met Ala Leu 
45 50 55 &u 

tac oat aaa aag gag tgg tat ttt ttc tea cca aga gat cga aag tat 
§6 Glu lys Gil Trl Tyr Phe Phe Ser Pro Arg Asp Arg Lys Tyr 
65 70 



125 130 



160 



gtt cgt aag aaa aac agt eta aga ttg gac gat tgg gta ttg tgt cgt 
Val Arg LyI Lys Asn Ser Leu Arg Leu Asp Asp Trp Val Leu Cys Arg 

ata tat aac aag aaa ggt gtc ate gag aag cga cga age gat ate gag 
lie Tyr Asn Lyi Lys Gly Val He Glu Lys Arg Arg Ser Asp lie Glu 



gac ggg tta aag cct gtg act gac acg tgt cca ccg gaa tct gtg gcg 
Asp Gly Leu LyI Pro Val Thr Asp Thr Cys Pro Pro Glu Ser Val Ala 



175 



s a s is m s e s is e s s k » s sj 

190 195 200 



aac aac got egg ttg agt aat gee ctt gat ttt ccg ttt aat tac gta 
III Asn Gly Sg Leu sir Asn Ala Leu Asp Phe Pro Phe Asn Tyr Val 
205 210 . 215 

aat acc ate qcc gat aac gag att gtg tea egg eta ttg ggc ggg aat 
Isp Ala lit Ala Asp Asn Glu lie Val Ser Arg Leu Leu Gly Gly Asn 



210 2 *5 

aac gag att gtg tea 
.„ r Asn Glu He Val Ser 
225 2 30 

caa atq tqq teg acg acg ctt gat cca ctt gtg gtt agg cag gga act 
G?n Met Tr| Ser Thr Thr Leu Asp Pro Leu Val Val Arg Gin Gly Thr 



240 



ttc tga gttgtcacgt gcgattagag ttagtggaaa gtggaaacta tcactgtctg 
Phe 



ttttcgcacg tgteggge 
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ill 



159 



207 



255 



303 



351 



cca aac ggt tea aga ccc aac cgt gca get ggt act gga tat tgg aaa 
Pro Asn Gly Ser Arg Pro Asn Arg Ala Ala Gly Thr Gly Tyr Trp Lys 
B0 85 90 

act acc qga get gat aaa cca ata ggt cgt cct aaa ccg gtt ggt att 
Ala Thr Gly Ala Asp Lys Pro lie Gly Arg Pro Lys Pro Val Gly lie 
95 100 105 

aaa aaa act eta qtg ttt tac teg gga aaa cct cca aat gga gag aaa 
Ly! Lyl Ala Leu Val Phe Tyr Ser Gly Lys Pro Pro Asn Gly Glu Lys 
110 H5 120 

acc aat tqq att atg cac gaa tac egg etc get gac gtt gac egg teg 495 
?hr Asn 1% lie Met His Glu Tyr Arg Leu Ala Asp Val Asp Arg Ser 

gat tgg gta ttg tgt 
Asp Trp Val Leu Cys 
150 155 

aag cga cga age gat 

Lys Arg Arg Ser Asp 

165 170 



399 



447 



543 



591 



639 



687 



735 



783 



831 



887 



905 
<210> 40
<211> 253
<212> PRT

<213> Arabidopsis thaliana
<400> 40

Met Met Lys Ser Gly Ala Asp Leu Gln Phe Pro Pro Gly Phe Arg Phe
1 5 10 15

His Pro Thr Asp Glu Glu Leu Val Leu Met Tyr Leu Cys Arg Lys Cys
20 25 30

Ala Ser Gln Pro Ile Pro Ala Pro Ile Ile Thr Glu Leu Asp Leu Tyr
35 40 45

Arg Tyr Asp Pro Trp Asp Leu Pro Asp Met Ala Leu Tyr Gly Glu Lys
50 55 60

Glu Trp Tyr Phe Phe Ser Pro Arg Asp Arg Lys Tyr Pro Asn Gly Ser
65 70 75 80

Arg Pro Asn Arg Ala Ala Gly Thr Gly Tyr Trp Lys Ala Thr Gly Ala
85 90 95

Asp Lys Pro Ile Gly Arg Pro Lys Pro Val Gly Ile Lys Lys Ala Leu
100 105 110

Val Phe Tyr Ser Gly Lys Pro Pro Asn Gly Glu Lys Thr Asn Trp Ile
115 120 125

Met His Glu Tyr Arg Leu Ala Asp Val Asp Arg Ser Val Arg Lys Lys
130 135 140

Asn Ser Leu Arg Leu Asp Asp Trp Val Leu Cys Arg Ile Tyr Asn Lys
145 150 155 160

Lys Gly Val Ile Glu Lys Arg Arg Ser Asp Ile Glu Asp Gly Leu Lys
165 170 175

Pro Val Thr Asp Thr Cys Pro Pro Glu Ser Val Ala Arg Leu Ile Ser
180 185 190

Gly Ser Glu Gln Ala Val Ser Pro Glu Phe Thr Cys Ser Asn Gly Arg
195 200 205

Leu Ser Asn Ala Leu Asp Phe Pro Phe Asn Tyr Val Asp Ala Ile Ala
210 215 220

Asp Asn Glu Ile Val Ser Arg Leu Leu Gly Gly Asn Gln Met Trp Ser
225 230 235 240

Thr Thr Leu Asp Pro Leu Val Val Arg Gln Gly Thr Phe
245 250



<210> 41 
<211> 1479 
<212> DNA 

<213> Arabidopsis thaliana 
<220>

<221> CDS

<222> (192)..(962)

<223> G765 



;;;Uc4» c«c t ««c 

c„c«c«. «tc«c«c , 9 =acca.c 9 t 9 «.t=«9~ gt.a.c.ct .tcc. t c,« 
c^ccctc ,.... 9 t t . t t9 ««o t e 9 ,.*,.««« c« 9 «tc« 9 a t =.. 9 c.t» 

— — 9 s sa is k p. s « e e a a s a; 
s a e a a e a e s « s i * s » s s 

IS K IS S S S 31 E S B - « 55 " S J? 
S III S 'S K E ffi K E E E S E 51 E « 

is s e a e a s b e a » e s a s e 
s s s a s e si 5- - ~ s l?? 2 21 s? K 

80 85 

s k s o s « E s s is s i a c K 

III S S, !!5 5. « SS S - « 3 S E S - £ 

110 115 

is si s m a s e it. e c s s e in e a 

130 1J5 
cat gaa tat egt ett gat g ? a aaa tat tot tat cat aac etc ccc aa. 
His Glu Tyr Ar 9 Leu Asp Gly Lys Tyr ser Tyr ^ 

145 150 
acc gca agg gat gaa tgg gtg gtg tgt agg gtt ttt cac aag «c get 
Thr Ala Arg Asp Glu Trp Val Val cys ^ 
160 165 

IS HI III S SI S S S E E SI S S E E E 

175 180 
ctt gat aac att gat cat etc tta gac ttc tea tct etc eet ect etc 
Leu Asp Asn He Asp His Leu Leu Asp Phe Ser ser Leu JM 
190 195 

£ SI IS ill S 3 IS S E E E E E IS 31! E 
III SI SI S E S SI S E IS IS III E E S S 

225 230 
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60 
120 
180 
230 

278 

326 

374 

422 

470 

518 

566 

614 

662 

710 

758 

806 

854 

902 

950 
Ser Arg Gln His Leu Pro Ser Tyr Pro Ser Ser Gln Phe Pro Leu Pro
240 245 250

ctc ggt ccc taa ttccggatct gatttcggct acggggcagg ttcaggcaat 1002
Leu Gly Pro
255

aataacaaag gtatgatcaa gttggagcat tctcttgtga gcgcgcctca agaaaccyyc 1062
ttgagttccg atgtgaacac aaccgcaacg ccagagatat cttcttatcc aatgatgatg 1122
aatccggcaa tgatggatgg tagcaagtca gcgtgtgatg gtcttgatga cttgatcttc 1182
tgggaagatt tatatactag ctaaatttgg gaaaaggtta tttgttaatt gtgattgaag 1242
agtggcatat tgattactcg tctagtgttt ttaatcgtgt aattagttcg tatataatat 1302
acatgtacat aagatcatta ggtttattag gcattggact ttagttcggt gattgcttac 1362
ctagttttta gcttgagaaa aaaggctgtc attggggtta tgtttctttg tgattaactt 1422
gtacatatat acatttaaat taaacgtatg gtttaaatcg tttaaaaaaa aaaaaaa 1479



<210> 42 
<211> 256 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 42 

Met Val Glu Glu Gly Gly Val Val Val Asn Gln Gly Gly Asp Gln Glu
1 5 10 15

Val Val Asp Leu Pro Pro Gly Phe Arg Phe His Pro Thr Asp Glu Glu
20 25 30

Ile Ile Thr His Tyr Leu Lys Glu Lys Val Phe Asn Ile Arg Phe Thr
35 40 45

Ala Ala Ala Ile Gly Gln Ala Asp Leu Asn Lys Asn Glu Pro Trp Asp
50 55 60

Leu Pro Lys Ile Ala Lys Met Gly Glu Lys Glu Phe Tyr Phe Phe Cys
65 70 75 80

Gln Arg Asp Arg Lys Tyr Pro Thr Gly Met Arg Thr Asn Arg Ala Thr
85 90 95

Val Ser Gly Tyr Trp Lys Ala Thr Gly Lys Asp Lys Glu Ile Phe Arg
100 105 110

Gly Lys Gly Cys Leu Val Gly Met Lys Lys Thr Leu Val Phe Tyr Thr
115 120 125

Gly Arg Ala Pro Lys Gly Glu Lys Thr Asn Trp Val Met His Glu Tyr
130 135 140

Arg Leu Asp Gly Lys Tyr Ser Tyr His Asn Leu Pro Lys Thr Ala Arg
145 150 155 160

Asp Glu Trp Val Val Cys Arg Val Phe His Lys Asn Ala Pro Ser Thr
165 170 175

Thr Ile Thr Thr Thr Lys Gln Leu Ser Arg Ile Asp Ser Leu Asp Asn
180 185 190

Ile Asp His Leu Leu Asp Phe Ser Ser Leu Pro Pro Leu Ile Asp Pro
195 200 205

Gly Phe Leu Gly Gln Pro Ala Gln Ala Ser Pro Val Pro Val Asn Asn
210 215 220

Thr Ile Ser Asn Leu Ser Pro Pro Ser Tyr Asn Arg Thr Ser Arg Gln
225 230 235 240

His Leu Pro Ser Tyr Pro Ser Ser Gln Phe Pro Leu Pro Leu Gly Pro
245 250 255



<210> 43 

<211> 825 

<212> DNA

<213> Arabidopsis thaliana 

<220> 

<221> CDS 

<222> (1)..(825) 

<223> G197 



pVb S f~ 5 SJ IS S K SB B = 5 f K 

3 s "i e c s e w s «• s - « 5 E 55 

20 25 

k e f s s a - ss ss a ss is f ss s a 
a; s s s?s ss s a S - = ir s 55 s ' 

S 55 Si IS S K S SS C K K S = S S £ 

1 « «c c« «« Mt £ ~9 g| tct ctt «« Jg £. g «J 
Leu Hie Ser Leu Leu Gly Asn Lys Trp » 9S 

85 * w 
cca gga aga aca gat aac gag .» aag aat tac tgg aac aca cat get 
Pro Gly Arg Thr Asp Asn Glu He Lys Asn iyr xrp ^ 
100 10& 

s si s ss s; si s § s is ss is s ss ss 

115 120 

S SS IS S K s k s; s s s s s ss ts s 

130 135 

e as s ss « sj ss ss sss ss § ~ a 25 B » 
s k ss sss §? is is ss ss s s si e is ss ss; 



48 



96 



144 



192 



240 



288 



336 



384 



432 



480 



528 
gtt gtt gaa gaa aga tgt ctg gac ttg aat ctt gag ctt agg atc agt 576
Val Val Glu Glu Arg Cys Leu Asp Leu Asn Leu Glu Leu Arg Ile Ser
180 185 190

cca cca tgg caa gac aag ctc cat gat gag agg aac cta agg ttt ggg 624
Pro Pro Trp Gln Asp Lys Leu His Asp Glu Arg Asn Leu Arg Phe Gly
195 200 205

aga gtg aag tat agg tgc agt gcg tgc cgt ttt gga ttc ggg aac ggc 672
Arg Val Lys Tyr Arg Cys Ser Ala Cys Arg Phe Gly Phe Gly Asn Gly
210 215 220

aag gag tgt agc tgt aat aat gtg aaa tgt caa aca gag gac agt agt 720
Lys Glu Cys Ser Cys Asn Asn Val Lys Cys Gln Thr Glu Asp Ser Ser
225 230 235 240

agc agc agt tat tct tca acc gac att agt agt agc att ggt tat gac 768
Ser Ser Ser Tyr Ser Ser Thr Asp Ile Ser Ser Ser Ile Gly Tyr Asp
245 250 255

ttc ttg ggt cta aac aac act agg gtt ttg gat ttt agc act ttg gaa 816
Phe Leu Gly Leu Asn Asn Thr Arg Val Leu Asp Phe Ser Thr Leu Glu
260 265 270

atg aaa tga 825
Met Lys

<210> 44
<211> 274
<212> PRT

<213> Arabidopsis thaliana
<400> 44

Met Gly Arg Ser Pro Cys Cys Glu Lys Asp His Thr Asn Lys Gly Ala
1 5 10 15

Trp Thr Lys Glu Glu Asp Asp Lys Leu Ile Ser Tyr Ile Lys Ala His
20 25 30

Gly Glu Gly Cys Trp Arg Ser Leu Pro Arg Ser Ala Gly Leu Gln Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Asp
50 55 60

Leu Lys Arg Gly Asn Phe Thr Leu Glu Glu Asp Asp Leu Ile Ile Lys
65 70 75 80

Leu His Ser Leu Leu Gly Asn Lys Trp Ser Leu Ile Ala Thr Arg Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr Trp Asn Thr His Val
100 105 110

Lys Arg Lys Leu Leu Arg Lys Gly Ile Asp Pro Ala Thr His Arg Pro
115 120 125

Ile Asn Glu Thr Lys Thr Ser Gln Asp Ser Ser Asp Ser Ser Lys Thr
130 135 140

Glu Asp Pro Leu Val Lys Ile Leu Ser Phe Gly Pro Gln Leu Glu Lys
145 150 155 160

Ile Ala Asn Phe Gly Asp Glu Arg Ile Gln Lys Arg Val Glu Tyr Ser
165 170 175

Val Val Glu Glu Arg Cys Leu Asp Leu Asn Leu Glu Leu Arg Ile Ser
180 185 190

Pro Pro Trp Gln Asp Lys Leu His Asp Glu Arg Asn Leu Arg Phe Gly
195 200 205

Arg Val Lys Tyr Arg Cys Ser Ala Cys Arg Phe Gly Phe Gly Asn Gly
210 215 220

Lys Glu Cys Ser Cys Asn Asn Val Lys Cys Gln Thr Glu Asp Ser Ser
225 230 235 240

Ser Ser Ser Tyr Ser Ser Thr Asp Ile Ser Ser Ser Ile Gly Tyr Asp
245 250 255

Phe Leu Gly Leu Asn Asn Thr Arg Val Leu Asp Phe Ser Thr Leu Glu
260 265 270

Met Lys



<210> 45 

<211> 918 

<212> DNA 

<213> Arabidopsis thaliana
<220> 

<221> CDS 

<222> (30).. (839) 

<223> G255 



^gcatcatca tcatcagaag aagagagtc atg gga aga tct cct tgc tgc gag 

1 5 



20 

cac ggt gaa ggt tgt tgg cga tct ctt 

His Gly Glu Gly Cys Trp Arg Ser Leu 
35 40 

cgc tgc ggt aaa age tgc cgt ctt cgc 
Arg Cys Gly Lys Ser Cys Arg Leu Arc 
50 55 

gat etc aaa aga gga aac ttt aca cat 
Asp Leu Lys Arg Gly Asn Phe Thr His 
65 70 

aag ctt cat age etc eta ggc aac aac 
Lys Leu His Ser Leu Leu Gly Asn Ly£ 
80 85 

tta cct gga aga aca gat aac gag at< 
Leu Pro Gly Arg Thr Asp Asn Glu Il< 
100 

ata aag agg aag ctt ttg age aaa gg< 



aaa gaa 
Lys Glu 
10 


cac 
His 


atg 
Met 


aac 
Asn 


aaa 
Lys 


ggt 
Gly 
15 


eta gtc 
Leu Val 
25 


tct 
Ser 


tac 
Tyr 


ate 
He 


aag 
Lys 
30 


tct 
Ser 


cct aga 
Pro Arg 


gee 
Ala 


get 
Ala 


ggt 
Gly 
45 


etc 
Leu 


ctt 
Leu 


tgg att 
Trp lie 


aac 
Asn 


tat 
Tyr 
60 


etc 
Leu 


cga 
Arg 


cct 
Pro 


gat gaa 
Asp Glu 


gat 

Asp 
75 


gaa 
Glu 


ctt 
Leu 


ate 
He 


ate 
He 


tgg tct 
Trp Ser 
90 


ttg 
Leu 


att 
lie 


gcg 
Ala 


gcg 
Ala 


aga 
Arg 
95 


aag aac 


tac 


tgg 


aac 


aca 


cat 



53 



101 



149 



197 



245 



293 



341 



389 
Lys Asn Tyr Trp Asn Thr His Ile Lys Arg Lys Leu Leu Ser Lys Gly
105 110 115 120

att gat cca gcc act cat aga ggg atc aac gag gca aaa att tct gat 437
Ile Asp Pro Ala Thr His Arg Gly Ile Asn Glu Ala Lys Ile Ser Asp
125 130 135

ttg aag aaa aca aag gac caa att gta aaa gat gtt tct ttt gtg aca 485
Leu Lys Lys Thr Lys Asp Gln Ile Val Lys Asp Val Ser Phe Val Thr
140 145 150

aag ttt gag gaa aca gac aag tct ggg gac cag aag caa aat aag tat 533
Lys Phe Glu Glu Thr Asp Lys Ser Gly Asp Gln Lys Gln Asn Lys Tyr
155 160 165

att cga aat ggg tta gtt tgc aaa gaa gag aga gtt gtt gtt gaa gaa 581
Ile Arg Asn Gly Leu Val Cys Lys Glu Glu Arg Val Val Val Glu Glu
170 175 180

aaa ata ggc cca gat ttg aat ctt gag ctt agg atc agt cca cca tgg 629
Lys Ile Gly Pro Asp Leu Asn Leu Glu Leu Arg Ile Ser Pro Pro Trp
185 190 195 200

caa aac cag aga gaa ata tct act tgc act gcg tcc cgt ttt tac atg 677
Gln Asn Gln Arg Glu Ile Ser Thr Cys Thr Ala Ser Arg Phe Tyr Met
205 210 215

gaa aac gac atg gag tgt agt agt gaa act gtg aaa tgt caa aca gag 725
Glu Asn Asp Met Glu Cys Ser Ser Glu Thr Val Lys Cys Gln Thr Glu
220 225 230

aat agt agc agc att agc tat tct tct att gat att agt agt agt aac 773
Asn Ser Ser Ser Ile Ser Tyr Ser Ser Ile Asp Ile Ser Ser Ser Asn
235 240 245

gtt ggt tat gac ttc ttg ggt ttg aag aca aga att ttg gat ttt cga 821
Val Gly Tyr Asp Phe Leu Gly Leu Lys Thr Arg Ile Leu Asp Phe Arg
250 255 260

agc ttg gaa atg aaa taa atgaatagta ttagattctt aatttgtagg 869
Ser Leu Glu Met Lys
265

tctgataatg aatgttagat tcgcggccct ctagacaggc ctcgtaccg 918

<210> 46 
<211> 269 
<212> PRT 

<213> Arabidopsis thalina 
<400> 46 

Met Gly Arg Ser Pro Cys Cys Glu Lys Glu His Met Asn Lys Gly Ala 
1 5 10 15

Trp Thr Lys Glu Glu Asp Glu Arg Leu Val Ser Tyr Ile Lys Ser His
20 25 30

Gly Glu Gly Cys Trp Arg Ser Leu Pro Arg Ala Ala Gly Leu Leu Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Ile Asn Tyr Leu Arg Pro Asp
50 55 60

Leu Lys Arg Gly Asn Phe Thr His Asp Glu Asp Glu Leu Ile Ile Lys
65 70 75 80

Leu His Ser Leu Leu Gly Asn Lys Trp Ser Leu Ile Ala Ala Arg Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn Tyr Trp Asn Thr His Ile
100 105

Lys Arg Lys Leu Leu Ser Lys Gly Ile Asp Pro Ala Thr His Arg Gly
115 120

Ile Asn Glu Ala Lys Ile Ser Asp Leu Lys Lys Thr Lys Asp Gln Ile
130 135

Val Lys Asp Val Ser Phe Val Thr Lys Phe Glu Glu Thr Asp Lys Ser
145 150



Gly Asp Gln Lys Gln Asn Lys Tyr Ile Arg Asn Gly Leu Val Cys Lys
165

Glu Glu Arg Val Val Val Glu Glu Lys Ile Gly Pro Asp Leu Asn Leu
180 185

Glu Leu Arg Ile Ser Pro Pro Trp Gln Asn Gln Arg Glu Ile Ser Thr
195 200

Cys Thr Ala Ser Arg Phe Tyr Met Glu Asn Asp Met Glu Cys Ser Ser
210 215

Glu Thr Val Lys Cys Gln Thr Glu Asn Ser Ser Ser Ile Ser Tyr Ser
225 230

Ser Ile Asp Ile Ser Ser Ser Asn Val Gly Tyr Asp Phe Leu Gly Leu

Lys Thr Arg Ile Leu Asp Phe Arg Ser Leu Glu Met Lys
260



<210> 47 

<211> 660 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (48)..(521)

<223> G1113 



lagctttatc tctttctttc tctcctctct atctttctct cacactc atg ggt ctc 56
Met Gly Leu
1

[nucleotides 57-200, encoding residues 4-51 of SEQ ID NO: 48, are illegible in the source]

gac ccg acc cga ctc gct ctc tcc acg tca gca act ctt gcc aac gag 248
Asp Pro Thr Arg Leu Ala Leu Ser Thr Ser Ala Thr Leu Ala Asn Glu 
55 60 65 

tta atc ccg gtg gtt cgt ttc tcc gat ctt tta acc gat ccg gaa gat 296
Leu Ile Pro Val Val Arg Phe Ser Asp Leu Leu Thr Asp Pro Glu Asp
70 75 80 

tgc tgc acg gtt tgc tta tcc gat ttt gta tcc gac gat aag att aga 344
Cys Cys Thr Val Cys Leu Ser Asp Phe Val Ser Asp Asp Lys Ile Arg
85 90 95 

cag ctg ccg aag tgt gga cac gtg ttt cat cat cgt tgt tta gac cgt 392 
Gln Leu Pro Lys Cys Gly His Val Phe His His Arg Cys Leu Asp Arg
100 105 110 115



tgg atc gtt gac tgt aat aag ata acg tgc ccg att tgt cgg aac cgg 440
Trp Ile Val Asp Cys Asn Lys Ile Thr Cys Pro Ile Cys Arg Asn Arg
120 125 130



ttc tta ccg gag gaa aag tcc acg ccg ttt gat tgg ggt act tca gat 488

Phe Leu Pro Glu Glu Lys Ser Thr Pro Phe Asp Trp Gly Thr Ser Asp
135 140 145 

tgg ttt aga gat gaa gtg gag agt acc aac taa taatgatggt tttactttta 541
Trp Phe Arg Asp Glu Val Glu Ser Thr Asn 
150 155 

ctttttactt ttttcacggt aatatttttc tactgtataa ttctttcttc caaactactg 601 

tataattcaa gtataagatt atgtaattgt gtatattagc atcaatcatc tttctttgt 660 

<210> 48 

<211> 157 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 48 

Met Gly Leu Pro Thr Asp Phe Lys Glu Leu Gln Ile Pro Gly Tyr Val
1 5 10 15

Leu Lys Thr Leu Tyr Val Ile Gly Phe Phe Arg Asp Met Val Asp Ala
20 25 30

Leu Cys Pro Tyr Ile Gly Leu Pro Ser Phe Leu Asp His Asn Glu Thr
35 40 45

Ser Arg Ser Asp Pro Thr Arg Leu Ala Leu Ser Thr Ser Ala Thr Leu
50 55 60

Ala Asn Glu Leu Ile Pro Val Val Arg Phe Ser Asp Leu Leu Thr Asp
65 70 75 80

Pro Glu Asp Cys Cys Thr Val Cys Leu Ser Asp Phe Val Ser Asp Asp
85 90 95

Lys Ile Arg Gln Leu Pro Lys Cys Gly His Val Phe His His Arg Cys
100 105 110

Leu Asp Arg Trp Ile Val Asp Cys Asn Lys Ile Thr Cys Pro Ile Cys
115 120 125

Arg Asn Arg Phe Leu Pro Glu Glu Lys Ser Thr Pro Phe Asp Trp Gly 
130 135 140 
Thr Ser Asp Trp Phe Arg Asp Glu Val Glu Ser Thr Asn
145 150



<210> 49

<211> 1201

<212> DNA

<213> Arabidopsis thaliana

<220>

<221> CDS

<222> (148)..(996)

<223> G398



a^aaggtttc tcttgtcctc catacactta gcacaactga taaatctttt gaggtaaaat 60

cagctttaga tcaaggtttt tctagtcatc tctactcata aagatcaaag cttttgctat 120

tctcattttc taccaagaga caatatc atg atg atg ggt aaa gag gat ttg ggt 174
Met Met Met Gly Lys Glu Asp Leu Gly
1 5

tta aat ctt agc ttg gga ttt gca caa aac cat cct ctc cag cta aat 222
Leu Ser Leu Ser Leu Gly Phe Ala Gln Asn His Pro Leu Gln Leu Asn
10 15 20

ctt aaa ccc act tct tca cca atg tcc aat ctc cag atg ttt cca tgg 270
Leu Lys Pro Thr Ser Ser Pro Met Ser Asn Leu Gln Met Phe Pro Trp
30 35 40

aar caa acc ctt gtt tct tcc tca gat caa caa aag caa cag ttt ctt 318
Asn Gln Thr Leu Val Ser Ser Ser Asp Gln Gln Lys Gln Gln Phe Leu
45 50 55

aaa aaa atc gac gtg aac agc ttg cca aca acg gtg gat ttg gaa gag 366
Arg Lys Ile Asp Val Asn Ser Leu Pro Thr Thr Val Asp Leu Glu Glu
60 65 70

gag aca gga gtt tcg tct cca aac agt acg atc tcg agc aca gtg agt 414
Glu Thr Gly Val Ser Ser Pro Asn Ser Thr Ile Ser Ser Thr Val Ser
75 80 85

gga aag agg agg agt act gaa aga gaa ggt acc tcc ggt ggt ggt tgc 462
Gly Lys Arg Arg Ser Thr Glu Arg Glu Gly Thr Ser Gly Gly Gly Cys
90 95 100

gga gat gac ctt gac atc act cta gat aga tct tcc tca cgt gga acc 510
Gly Asp Asp Leu Asp Ile Thr Leu Asp Arg Ser Ser Ser Arg Gly Thr
110 115 120

tcc gat gaa gag gaa gat tac gga ggt gag act tgt agg aag aag ctt 558
Ser Asp Glu Glu Glu Asp Tyr Gly Gly Glu Thr Cys Arg Lys Lys Leu
125 130 135

aaa cta tcc aaa gat caa tcc gca gtt ctc gaa gac act ttc aaa gag 606
Arg Leu Ser Lys Asp Gln Ser Ala Val Leu Glu Asp Thr Phe Lys Glu
140 145 150

cac aat act ctc aat ccc aaa cag aag ctg gct ttg gct aag aag cta 654
His Asn Thr Leu Asn Pro Lys Gln Lys Leu Ala Leu Ala Lys Lys Leu
155 160 165

ggt tta aca gca aga caa gtg gaa gtg tgg ttc caa aac aga aga gca 702
Gly Leu Thr Ala Arg Gln Val Glu Val Trp Phe Gln Asn Arg Arg Ala
170 175 180 185

agg aca aag tta aag cag acc gaa gtg gat tgc gag tat ttg aaa aga 750
Arg Thr Lys Leu Lys Gln Thr Glu Val Asp Cys Glu Tyr Leu Lys Arg
190 195

tgt gtt gag aaa tta acg gaa gag aat cgg cgg ctc gag aaa gag gca 798
Cys Val Glu Lys Leu Thr Glu Glu Asn Arg Arg Leu Glu Lys Glu Ala
205 210 215

gcg gaa cta aga gca tta aag ctt tca ccg cgg ttg tat ggt cag atg 846
Ala Glu Leu Arg Ala Leu Lys Leu Ser Pro Arg Leu Tyr Gly Gln Met
220 225 230

agt cca ccg acc aca ctt ttg atg tgt cca tcg tgt gaa cgt gtg gcc 894
Ser Pro Pro Thr Thr Leu Leu Met Cys Pro Ser Cys Glu Arg Val Ala
235 240 245

gga cca tcc tca tct aac cac aac cag cga tct gtc tca ttg agt cca 942
Gly Pro Ser Ser Ser Asn His Asn Gln Arg Ser Val Ser Leu Ser Pro
250 255 260 265

tgg ctc caa atg gcc cat ggg tca acc ttt gat gtg atg cgt cct agg 990
Trp Leu Gln Met Ala His Gly Ser Thr Phe Asp Val Met Arg Pro Arg
270 275 280

tct taa ctttaatgct gcttctatgg gttgtgtgtg ggtcattgta ctttttagat 1046
Ser

tattgactct cagctaatgt atccttaaaa gcctttttct acttttaaat ttactttaat 1106

ctaattaaat tagttgtcca tgtcttcttg ataacaaaaa aatttataat tataaaaaaa 1166 

aaaaacagga taaaaaaaaa aaaaaaaaaa aaaaa 1201 

<210> 50 

<211> 282 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 50 

Met Met Met Gly Lys Glu Asp Leu Gly Leu Ser Leu Ser Leu Gly Phe 
1 5 10 15 

Ala Gln Asn His Pro Leu Gln Leu Asn Leu Lys Pro Thr Ser Ser Pro
20 25 30

Met Ser Asn Leu Gln Met Phe Pro Trp Asn Gln Thr Leu Val Ser Ser
35 40 45

Ser Asp Gln Gln Lys Gln Gln Phe Leu Arg Lys Ile Asp Val Asn Ser
50 55 60

Leu Pro Thr Thr Val Asp Leu Glu Glu Glu Thr Gly Val Ser Ser Pro
65 70 75 80

Asn Ser Thr Ile Ser Ser Thr Val Ser Gly Lys Arg Arg Ser Thr Glu
85 90 95

Arg Glu Gly Thr Ser Gly Gly Gly Cys Gly Asp Asp Leu Asp Ile Thr
100 105 110

Leu Asp Arg Ser Ser Ser Arg Gly Thr Ser Asp Glu Glu Glu Asp Tyr
115 120 125

Gly Gly Glu Thr Cys Arg Lys Lys Leu Arg Leu Ser Lys Asp Gln Ser
130 135 140

Ala Val Leu Glu Asp Thr Phe Lys Glu His Asn Thr Leu Asn Pro Lys 
145 150 155 160 
Gln Lys Leu Ala Leu Ala Lys Lys Leu Gly Leu Thr Ala Arg Gln Val
165 170

Glu Val Trp Phe Gln Asn Arg Arg Ala Arg Thr Lys Leu Lys Gln Thr
180 185

Glu Val Asp Cys Glu Tyr Leu Lys Arg Cys Val Glu Lys Leu Thr Glu
195 200

Glu Asn Arg Arg Leu Glu Lys Glu Ala Ala Glu Leu Arg Ala Leu Lys
210 215

Leu Ser Pro Arg Leu Tyr Gly Gln Met Ser Pro Pro Thr Thr Leu Leu
225 230

Met Cys Pro Ser Cys Glu Arg Val Ala Gly Pro Ser Ser Ser Asn His

Asn Gln Arg Ser Val Ser Leu Ser Pro Trp Leu Gln Met Ala His Gly
260 265



Ser Thr Phe Asp Val Met Arg Pro Arg Ser 
275 280 



<210> 51 

<211> 937 

<212> DNA

<213> Arabidopsis thaliana 

<220> 

<221> CDS 

<222> (120)..(797)

<223> G395 



[nucleotides 1-455, encoding residues 1-112 of SEQ ID NO: 52, are illegible in the source]

455 
ttg aat ctt cgt cct cgt cag gtt gaa gtc tgg ttt caa aac aga cga 503
Leu Asn Leu Arg Pro Arg Gln Val Glu Val Trp Phe Gln Asn Arg Arg
115 120 125

gcc agg aca aag ctg aag caa acg gaa gtg gac tgt gaa tac cta aag 551
Ala Arg Thr Lys Leu Lys Gln Thr Glu Val Asp Cys Glu Tyr Leu Lys
130 135 140

aga tgc tgt gag tca cta acc gaa gaa aac cgg agg ctt caa aaa gag 599
Arg Cys Cys Glu Ser Leu Thr Glu Glu Asn Arg Arg Leu Gln Lys Glu
145 150 155 160

gtt aaa gaa ttg aga acc ttg aag act tcc aca ccc ttt tac atg caa 647
Val Lys Glu Leu Arg Thr Leu Lys Thr Ser Thr Pro Phe Tyr Met Gln
165 170 175



ctt ccg gcc act act ctc act atg tgc cct tct tgt gaa cgt gtt gcc 695
Leu Pro Ala Thr Thr Leu Thr Met Cys Pro Ser Cys Glu Arg Val Ala
180 185 190

act tca gca gca cag ccc tcc acg tca gct gcc cac aac ctc tgt ttg 743
Thr Ser Ala Ala Gln Pro Ser Thr Ser Ala Ala His Asn Leu Cys Leu
195 200 205

tcc acg tca tca ttg att ccg gtt aag cct cgg ccg gcc aaa caa gtt 791
Ser Thr Ser Ser Leu Ile Pro Val Lys Pro Arg Pro Ala Lys Gln Val
210 215 220

tca tga aagcacctgc gaaatacagt ttgagcaaac gggcggccgc tctagacagg 847

Ser

225

cctcgtaccg gatcctctag ctagagcttt cgttcgtatc atcggtttcg acaacgttcg 907
tcaagttcaa tgacatcagt ttgattgcgc 937

<210> 52 

<211> 225 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 52 

Met Pro Leu Gly Ala Ala Thr Val Val Glu Glu Glu Glu Glu Glu Glu 
1 5 10 15

Glu Ala Val Pro Ser Met Ser Val Ser Pro Pro Asp Ser Val Thr Ser
20 25 30

Ser Phe Gln Leu Asp Phe Gly Ile Lys Ser Tyr Gly Tyr Glu Arg Arg
35 40 45

Ser Asn Lys Arg Asp Ile Asp Asp Glu Val Glu Arg Ser Ala Ser Arg
50 55 60

Ala Ser Asn Glu Asp Asn Asp Asp Glu Asn Gly Ser Thr Arg Lys Lys
65 70 75 80

Leu Arg Leu Ser Lys Asp Gln Ser Ala Phe Leu Glu Asp Ser Phe Lys
85 90 95

Glu His Ser Thr Leu Asn Pro Lys Gln Lys Ile Ala Leu Ala Lys Gln
100 105 110

Leu Asn Leu Arg Pro Arg Gln Val Glu Val Trp Phe Gln Asn Arg Arg
115 120 125 
Ala Arg Thr Lys Leu Lys Gln Thr Glu Val Asp Cys Glu Tyr Leu Lys
130 135

Arg Cys Cys Glu Ser Leu Thr Glu Glu Asn Arg Arg Leu Gln Lys Glu
145

Val Lys Glu Leu Arg Thr Leu Lys Thr Ser Thr Pro Phe Tyr Met Gln
165

Leu Pro Ala Thr Thr Leu Thr Met Cys Pro Ser Cys Glu Arg Val Ala
180

Thr Ser Ala Ala Gln Pro Ser Thr Ser Ala Ala His Asn Leu Cys Leu
195 200

Ser Thr Ser Ser Leu Ile Pro Val Lys Pro Arg Pro Ala Lys Gln Val
210 215 220

Ser
225



<210> 53 

<211> 927 

<212> DNA

<213> Arabidopsis thaliana
<220> 

<221> CDS 

<222> (37)..(861)

<223> G393 



[nucleotides 1-438 are illegible in the source]

Lys Gln Gln Ser Ala Leu Leu Glu Glu Ser Phe Lys Asp His Ser Thr
120 125 130

ctt aat ccc aaa caa aag caa gtt ctg gct aga cag ctg aat cta agg 486
Leu Asn Pro Lys Gln Lys Gln Val Leu Ala Arg Gln Leu Asn Leu Arg
135 140 145 150

cct aga caa gtt gaa gta tgg ttt caa aat aga aga gcc agg aca aag 534
Pro Arg Gln Val Glu Val Trp Phe Gln Asn Arg Arg Ala Arg Thr Lys
155 160 165

ctg aag caa aca gaa gta gat tgt gag ttt ttg aag aag tgt tgt gaa 582
Leu Lys Gln Thr Glu Val Asp Cys Glu Phe Leu Lys Lys Cys Cys Glu
170 175 180

aca tta gca gat gag aac ata aga ctt cag aaa gag att caa gaa ctc 630
Thr Leu Ala Asp Glu Asn Ile Arg Leu Gln Lys Glu Ile Gln Glu Leu
185 190 195

aaa acc cta aaa ttg act cag ccc ttt tac atg cac atg cct gca tcg 678
Lys Thr Leu Lys Leu Thr Gln Pro Phe Tyr Met His Met Pro Ala Ser
200 205 210

act cta acg aag tgt cct tct tgt gag aga atc ggc ggc ggc ggc ggg 726
Thr Leu Thr Lys Cys Pro Ser Cys Glu Arg Ile Gly Gly Gly Gly Gly
215 220 225 230

ggt aat gga gga gga ggt ggc ggc agc ggg gct acc gcg gtg att gta 774
Gly Asn Gly Gly Gly Gly Gly Gly Ser Gly Ala Thr Ala Val Ile Val
235 240 245

gat gga agt acg gcc aaa gga gct ttc tct atc tcc tca aag cct cac 822
Asp Gly Ser Thr Ala Lys Gly Ala Phe Ser Ile Ser Ser Lys Pro His
250 255 260



ttc ttc aac cct ttt act aac cca tct gca gct tgt tga atagttaatt 871
Phe Phe Asn Pro Phe Thr Asn Pro Ser Ala Ala Cys 
265 270 

cgtttaattt tattacttaa aatattaatt ttcttttttt ttttgggtgg catttt 927 

<210> 54 

<211> 274 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 54 

Met Gly Phe Asp Asp Thr Cys Asn Thr Gly Leu Val Leu Gly Leu Gly 
1 5 10 15 



Pro Ser Pro Ile Ser Asn Asn Tyr Asn Ser Thr Ile Arg Gln Ser Ser
20 25 30

Val Tyr Lys Leu Glu Pro Ser Leu Thr Leu Cys Leu Ser Gly Asp Pro
35 40 45

Ser Val Thr Val Val Thr Gly Ala Asp Gln Leu Cys Arg Gln Thr Ser
50 55 60

Ser His Ser Gly Val Ser Ser Phe Ser Ser Gly Arg Val Val Lys Arg
65 70 75 80

Glu Arg Asp Gly Gly Glu Glu Ser Pro Glu Glu Glu Glu Met Thr Glu
85 90 95

Arg Val Ile Ser Asp Tyr His Glu Asp Glu Glu Gly Ile Ser Ala Arg
100 105 110 
Lys Glu Ile Gln Glu Leu Lys Thr Leu Lys Leu Thr Gln Pro Phe Tyr
195 200 205

Met His Met Pro Ala Ser Thr Leu Thr Lys Cys Pro Ser Cys Glu Arg
210

Ile Gly Gly Gly Gly Gly Gly Asn Gly Gly Gly Gly Gly Gly Ser Gly
225

Ala Thr Ala Val Ile Val Asp Gly Ser Thr Ala Lys Gly Ala Phe Ser
245

Ile Ser Ser Lys Pro His Phe Phe Asn Pro Phe Thr Asn Pro Ser Ala

Ala Cys